Overview

Dataset statistics

Number of variables: 32
Number of observations: 903653
Missing cells: 6389257
Missing cells (%): 22.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 GiB
Average record size in memory: 1.4 KiB
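
The derived figures in this table can be re-checked from the raw counts. The sketch below assumes that the missing-cell percentage is the missing cells divided by variables times observations, and that the average record size is the total size divided by the row count. The variable names are illustrative, and the 1.2 GiB input is already rounded, so the record size is approximate.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 32
n_observations = 903_653
missing_cells = 6_389_257
total_size_gib = 1.2  # rounded in the report, so derived values are approximate

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 GiB = 1024**3 bytes and 1 KiB = 1024 bytes
avg_record_kib = total_size_gib * 1024**3 / n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_kib:.1f} KiB")
```

Both results round to the values shown in the report (22.1% and 1.4 KiB).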

Variable types

Categorical: 11
DateTime: 1
Unsupported: 1
Text: 10
Numeric: 7
Boolean: 1
Path: 1

Alerts

adContent is highly overall correlated with campaign and 4 other fields [High correlation]
adwordsClickInfo.adNetworkType is highly overall correlated with channelGrouping and 1 other fields [High correlation]
adwordsClickInfo.page is highly overall correlated with channelGrouping and 1 other fields [High correlation]
adwordsClickInfo.slot is highly overall correlated with channelGrouping and 1 other fields [High correlation]
campaign is highly overall correlated with adContent and 2 other fields [High correlation]
channelGrouping is highly overall correlated with adContent and 5 other fields [High correlation]
continent is highly overall correlated with subContinent [High correlation]
deviceCategory is highly overall correlated with isMobile and 1 other fields [High correlation]
hits is highly overall correlated with pageviews [High correlation]
isMobile is highly overall correlated with deviceCategory and 1 other fields [High correlation]
medium is highly overall correlated with adContent and 5 other fields [High correlation]
operatingSystem is highly overall correlated with deviceCategory and 1 other fields [High correlation]
pageviews is highly overall correlated with hits [High correlation]
subContinent is highly overall correlated with continent [High correlation]
visitId is highly overall correlated with adContent and 1 other fields [High correlation]
visitStartTime is highly overall correlated with adContent and 1 other fields [High correlation]
campaign is highly imbalanced (90.3%) [Imbalance]
adwordsClickInfo.slot is highly imbalanced (83.9%) [Imbalance]
adwordsClickInfo.adNetworkType is highly imbalanced (99.6%) [Imbalance]
conversion is highly imbalanced (90.2%) [Imbalance]
transactionRevenue has 892138 (98.7%) missing values [Missing]
keyword has 502929 (55.7%) missing values [Missing]
referralPath has 572712 (63.4%) missing values [Missing]
adwordsClickInfo.page has 882193 (97.6%) missing values [Missing]
adwordsClickInfo.slot has 882193 (97.6%) missing values [Missing]
adwordsClickInfo.gclId has 882092 (97.6%) missing values [Missing]
adwordsClickInfo.adNetworkType has 882193 (97.6%) missing values [Missing]
adContent has 892707 (98.8%) missing values [Missing]
transactionRevenue is highly skewed (γ1 = 25.7227026) [Skewed]
adwordsClickInfo.page is highly skewed (γ1 = 40.17090183) [Skewed]
fullVisitorId is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
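
The imbalance percentages in the alerts can be read as a normalized-entropy score. The sketch below is an assumption about how such a score may be computed (one minus the Shannon entropy divided by log2 of the number of classes); it is not taken from this report, and `imbalance_score` is a hypothetical helper name. The flagged variables' counts are not listed here, so the channelGrouping counts stand in as an illustration only.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 for a uniform column, near 1 for a dominated one."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy_bits = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy_bits / math.log2(n_classes)

# channelGrouping counts from this report, used only as an illustration
# (that variable is not among the ones flagged as imbalanced).
channel_counts = [381561, 226117, 143026, 104838, 25326, 16403, 6262, 120]
print(f"{imbalance_score(channel_counts):.3f}")
```

Under this definition a score above roughly 0.8 to 0.9, as reported for campaign or conversion, means nearly all rows fall into a single class.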

Reproduction

Analysis started: 2024-02-20 12:38:37.555919
Analysis finished: 2024-02-20 12:40:59.337155
Duration: 2 minutes and 21.78 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json

Variables

channelGrouping
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 57.6 MiB
Organic Search: 381561
Social: 226117
Direct: 143026
Referral: 104838
Paid Search: 25326
Other values (3): 22785

Length

Max length: 14
Median length: 11
Mean length: 9.8297754
Min length: 6

Characters and Unicode

Total characters: 8882706
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Organic Search
2nd row: Organic Search
3rd row: Organic Search
4th row: Organic Search
5th row: Organic Search

Common Values

Value           Count   Frequency (%)
Organic Search  381561  42.2%
Social          226117  25.0%
Direct          143026  15.8%
Referral        104838  11.6%
Paid Search     25326   2.8%
Affiliates      16403   1.8%
Display         6262    0.7%
(Other)         120     < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count   Frequency (%)
search      406887  31.0%
organic     381561  29.1%
social      226117  17.3%
direct      143026  10.9%
referral    104838  8.0%
paid        25326   1.9%
affiliates  16403   1.3%
display     6262    0.5%
other       120     < 0.1%
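
The word table can be reproduced from the category counts. The sketch below assumes that words are lowercase alphabetic runs, so that "(Other)" becomes "other"; the tokenizer is an assumption, not the profiler's actual implementation.

```python
import re
from collections import Counter

# Category counts copied from the "Common Values" table for channelGrouping.
category_counts = {
    "Organic Search": 381561,
    "Social": 226117,
    "Direct": 143026,
    "Referral": 104838,
    "Paid Search": 25326,
    "Affiliates": 16403,
    "Display": 6262,
    "(Other)": 120,
}

word_counts = Counter()
for value, count in category_counts.items():
    # Assumption: a word is a run of letters after lowercasing.
    for word in re.findall(r"[a-z]+", value.lower()):
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common(3):
    print(f"{word:<10} {count:>7} {100 * count / total_words:.1f}%")
```

"search" reaches 406887 because it occurs in both "Organic Search" (381561) and "Paid Search" (25326).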

Most occurring characters

Value              Count    Frequency (%)
a                  1167394  13.1%
c                  1157591  13.0%
r                  1141270  12.8%
i                  815098   9.2%
e                  776112   8.7%
S                  633004   7.1%
h                  407007   4.6%
(space)            406887   4.6%
O                  381681   4.3%
g                  381561   4.3%
Other values (15)  1615101  18.2%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   7165039  80.7%
Uppercase Letter   1310540  14.8%
Space Separator    406887   4.6%
Open Punctuation   120      < 0.1%
Close Punctuation  120      < 0.1%
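
The category counts above follow from the Unicode general category of each character. The sketch below uses Python's `unicodedata` module and the channelGrouping value counts from this report; the two-letter codes it prints (`Ll`, `Lu`, `Zs`, `Ps`, `Pe`) correspond to Lowercase Letter, Uppercase Letter, Space Separator, Open Punctuation and Close Punctuation.

```python
import unicodedata
from collections import Counter

# Category counts copied from the "Common Values" table for channelGrouping.
category_counts = {
    "Organic Search": 381561,
    "Social": 226117,
    "Direct": 143026,
    "Referral": 104838,
    "Paid Search": 25326,
    "Affiliates": 16403,
    "Display": 6262,
    "(Other)": 120,
}

general_categories = Counter()
for value, count in category_counts.items():
    for ch in value:
        # Weight each character by the number of rows holding this value.
        general_categories[unicodedata.category(ch)] += count

total_characters = sum(general_categories.values())
print(general_categories.most_common())
print(total_characters)
```

The total matches the "Total characters" figure reported for this variable.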

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1167394
16.3%
c 1157591
16.2%
r 1141270
15.9%
i 815098
11.4%
e 776112
10.8%
h 407007
 
5.7%
g 381561
 
5.3%
n 381561
 
5.3%
l 353620
 
4.9%
o 226117
 
3.2%
Other values (6) 357708
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
S 633004
48.3%
O 381681
29.1%
D 149288
 
11.4%
R 104838
 
8.0%
P 25326
 
1.9%
A 16403
 
1.3%
Space Separator
ValueCountFrequency (%)
406887
100.0%
Open Punctuation
ValueCountFrequency (%)
( 120
100.0%
Close Punctuation
ValueCountFrequency (%)
) 120
100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   8475579  95.4%
Common  407127   4.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1167394
13.8%
c 1157591
13.7%
r 1141270
13.5%
i 815098
9.6%
e 776112
9.2%
S 633004
7.5%
h 407007
 
4.8%
O 381681
 
4.5%
g 381561
 
4.5%
n 381561
 
4.5%
Other values (12) 1233300
14.6%
Common
ValueCountFrequency (%)
406887
99.9%
( 120
 
< 0.1%
) 120
 
< 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  8882706  100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1167394
13.1%
c 1157591
13.0%
r 1141270
12.8%
i 815098
9.2%
e 776112
8.7%
S 633004
 
7.1%
h 407007
 
4.6%
406887
 
4.6%
O 381681
 
4.3%
g 381561
 
4.3%
Other values (15) 1615101
18.2%

date
Date

Distinct: 366
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.9 MiB
Minimum: 1970-01-01 00:00:00.020160
Maximum: 1970-01-01 00:00:00.020170
Histogram with fixed size bins (bins=50)
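
The reported minimum and maximum sit a few milliseconds after the Unix epoch, which is consistent with integers such as 20160801 having been read as nanoseconds instead of YYYYMMDD dates. The raw column is not shown in this report, so that is an assumption; the sketch below contrasts the two interpretations using a hypothetical raw value and illustrative helper names.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def as_epoch_nanoseconds(raw: int) -> datetime:
    """Interpret an integer as nanoseconds since the epoch (the suspected misparse)."""
    # datetime has microsecond resolution, so the last three digits are truncated.
    return EPOCH + timedelta(microseconds=raw // 1000)

def as_calendar_date(raw: int) -> datetime:
    """Interpret the same integer as a YYYYMMDD calendar date."""
    return datetime.strptime(str(raw), "%Y%m%d")

raw_value = 20160801  # hypothetical raw value, chosen to match the reported minimum
print(as_epoch_nanoseconds(raw_value).strftime("%Y-%m-%d %H:%M:%S.%f"))
print(as_calendar_date(raw_value).date())
```

If the assumption holds, parsing the column with an explicit `%Y%m%d` format before profiling would give a meaningful date range.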

fullVisitorId
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 38.7 MiB

Distinct: 902755
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 74.9 MiB

Length

Max length: 31
Median length: 30
Mean length: 29.878958
Min length: 24

Characters and Unicode

Total characters: 27000210
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 901857
Unique (%): 99.8%
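
"Distinct" and "Unique" measure different things: distinct counts values seen at least once, unique counts values seen exactly once. The sketch below shows the difference on a toy column and then checks the reported figures, assuming every repeated value occurs exactly twice (the largest count listed for this variable is 2). `distinct_and_unique` is an illustrative helper, not a profiler function.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): values seen at least once vs. exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy column: "b" and "d" repeat, so they are distinct but not unique.
print(distinct_and_unique(["a", "b", "b", "c", "d", "d"]))

# Consistency check on the reported figures, assuming each repeated value occurs twice.
n_rows, distinct = 903_653, 902_755
repeated = n_rows - distinct   # number of values that occur twice
unique = distinct - repeated
print(unique)
```

The result agrees with the reported unique count of 901857.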

Sample

1st row: 1131660440785968503_1472830385
2nd row: 377306020877927890_1472880147
3rd row: 3895546263509774583_1472865386
4th row: 4763447161404445595_1472881213
5th row: 27294437909732085_1472822600
Value                           Count   Frequency (%)
14108533830165900_1482391162    2       < 0.1%
7328657227470299189_1488700723  2       < 0.1%
2896409751679637889_1497249098  2       < 0.1%
1920231893487967243_1494224540  2       < 0.1%
9328692348677910181_1490338007  2       < 0.1%
2936634930290678437_1486195062  2       < 0.1%
3332070148916030858_1488700627  2       < 0.1%
31144590393948452_1499929196    2       < 0.1%
4203774291422453000_1499065071  2       < 0.1%
0461869076077289632_1495090739  2       < 0.1%
Other values (902745)           903633  > 99.9%
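
Identifiers like the ones in this table are safer kept as strings. The sketch below illustrates why, under the assumption that the part before the underscore is a visitor ID and the part after it is a visit ID (the report does not state this).

```python
sample_id = "0461869076077289632_1495090739"  # taken from the value table above

# Assumption: "<visitor id>_<visit id>".
visitor_part, visit_part = sample_id.split("_")

# Casting to int silently drops the leading zero.
as_int_text = str(int(visitor_part))

# Casting to float loses precision for 19-digit IDs (they exceed 2**53).
big_id = 1131660440785968503
round_tripped = int(float(big_id))

print(visitor_part, as_int_text)
print(big_id, round_tripped)
```

Reading such columns with an explicit string type avoids both problems.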

Most occurring characters

ValueCountFrequency (%)
1 3250435
12.0%
4 3181333
11.8%
7 2672658
9.9%
8 2612708
9.7%
9 2597893
9.6%
0 2417472
9.0%
5 2378325
8.8%
6 2333813
8.6%
3 2329535
8.6%
2 2322385
8.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26096557
96.7%
Connector Punctuation 903653
 
3.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3250435
12.5%
4 3181333
12.2%
7 2672658
10.2%
8 2612708
10.0%
9 2597893
10.0%
0 2417472
9.3%
5 2378325
9.1%
6 2333813
8.9%
3 2329535
8.9%
2 2322385
8.9%
Connector Punctuation
ValueCountFrequency (%)
_ 903653
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27000210
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3250435
12.0%
4 3181333
11.8%
7 2672658
9.9%
8 2612708
9.7%
9 2597893
9.6%
0 2417472
9.0%
5 2378325
8.8%
6 2333813
8.6%
3 2329535
8.6%
2 2322385
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27000210
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3250435
12.0%
4 3181333
11.8%
7 2672658
9.9%
8 2612708
9.7%
9 2597893
9.6%
0 2417472
9.0%
5 2378325
8.8%
6 2333813
8.6%
3 2329535
8.6%
2 2322385
8.6%

visitId
Real number (ℝ)

HIGH CORRELATION 

Distinct: 886303
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4850073 × 10^9
Minimum: 1.4700348 × 10^9
Maximum: 1.5016572 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1.4700348 × 10^9
5-th percentile: 1.4716204 × 10^9
Q1: 1.4775612 × 10^9
median: 1.4839487 × 10^9
Q3: 1.4927588 × 10^9
95-th percentile: 1.4999783 × 10^9
Maximum: 1.5016572 × 10^9
Range: 31622381
Interquartile range (IQR): 15197593

Descriptive statistics

Standard deviation: 9022123.6
Coefficient of variation (CV): 0.0060754743
Kurtosis: -1.1657356
Mean: 1.4850073 × 10^9
Median Absolute Deviation (MAD): 7323946
Skewness: 0.19405148
Sum: 1.3419313 × 10^15
Variance: 8.1398713 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1493146175 8
 
< 0.1%
1478345904 6
 
< 0.1%
1481369525 6
 
< 0.1%
1484649802 6
 
< 0.1%
1500856602 5
 
< 0.1%
1494374199 5
 
< 0.1%
1495031359 5
 
< 0.1%
1488904421 4
 
< 0.1%
1481365730 4
 
< 0.1%
1488190236 4
 
< 0.1%
Other values (886293) 903600
> 99.9%
ValueCountFrequency (%)
1470034812 1
< 0.1%
1470035066 1
< 0.1%
1470035081 1
< 0.1%
1470035161 1
< 0.1%
1470035170 1
< 0.1%
1470035292 1
< 0.1%
1470035429 1
< 0.1%
1470035457 1
< 0.1%
1470035501 1
< 0.1%
1470035521 1
< 0.1%
ValueCountFrequency (%)
1501657193 1
< 0.1%
1501657190 1
< 0.1%
1501657186 1
< 0.1%
1501657166 1
< 0.1%
1501657161 1
< 0.1%
1501657013 1
< 0.1%
1501656998 1
< 0.1%
1501656981 1
< 0.1%
1501656976 1
< 0.1%
1501656970 1
< 0.1%

visitNumber
Real number (ℝ)

Distinct: 384
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.264897
Minimum: 1
Maximum: 395
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 5
Maximum: 395
Range: 394
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 9.2837345
Coefficient of variation (CV): 4.0989654
Kurtosis: 517.3079
Mean: 2.264897
Median Absolute Deviation (MAD): 0
Skewness: 19.998064
Sum: 2046681
Variance: 86.187726
Monotonicity: Not monotonic
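
Several of the visitNumber statistics are tied together by simple identities: the mean is the sum divided by the row count, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below re-derives them from the reported inputs; the variable names are illustrative.

```python
# Figures copied from the visitNumber statistics and the dataset overview.
n = 903_653
total = 2_046_681   # Sum
std = 9.2837345     # Standard deviation

mean = total / n
cv = std / mean     # coefficient of variation
variance = std ** 2

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.6f}")
```

The results agree with the reported mean, CV and variance to the displayed precision.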
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 703060
77.8%
2 92548
 
10.2%
3 35843
 
4.0%
4 19157
 
2.1%
5 11615
 
1.3%
6 7677
 
0.8%
7 5413
 
0.6%
8 4031
 
0.4%
9 3084
 
0.3%
10 2415
 
0.3%
Other values (374) 18810
 
2.1%
ValueCountFrequency (%)
1 703060
77.8%
2 92548
 
10.2%
3 35843
 
4.0%
4 19157
 
2.1%
5 11615
 
1.3%
6 7677
 
0.8%
7 5413
 
0.6%
8 4031
 
0.4%
9 3084
 
0.3%
10 2415
 
0.3%
ValueCountFrequency (%)
395 1
< 0.1%
394 1
< 0.1%
393 1
< 0.1%
391 1
< 0.1%
390 1
< 0.1%
389 1
< 0.1%
388 1
< 0.1%
387 1
< 0.1%
386 1
< 0.1%
385 1
< 0.1%

visitStartTime
Real number (ℝ)

HIGH CORRELATION 

Distinct: 887159
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4850073 × 10^9
Minimum: 1.4700348 × 10^9
Maximum: 1.5016572 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1.4700348 × 10^9
5-th percentile: 1.4716204 × 10^9
Q1: 1.4775612 × 10^9
median: 1.4839489 × 10^9
Q3: 1.4927588 × 10^9
95-th percentile: 1.4999783 × 10^9
Maximum: 1.5016572 × 10^9
Range: 31622381
Interquartile range (IQR): 15197593
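
The magnitude of these values is consistent with Unix epoch seconds, although the report does not say so. Under that assumption, the sketch below converts the smallest and largest values listed for this variable to UTC timestamps and checks the reported range.

```python
from datetime import datetime, timezone

# Minimum and maximum copied from the full-precision value tables in this section.
min_ts, max_ts = 1_470_034_812, 1_501_657_193

# Assumption: the integers are Unix epoch seconds.
start = datetime.fromtimestamp(min_ts, tz=timezone.utc)
end = datetime.fromtimestamp(max_ts, tz=timezone.utc)

print(start.isoformat())
print(end.isoformat())
print(max_ts - min_ts, (end - start).days)
```

Read this way, the data spans about one year, from early August 2016 to early August 2017, which fits the 366 distinct values reported for the date variable.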

Descriptive statistics

Standard deviation: 9022123.7
Coefficient of variation (CV): 0.0060754743
Kurtosis: -1.1657356
Mean: 1.4850073 × 10^9
Median Absolute Deviation (MAD): 7323976
Skewness: 0.19405153
Sum: 1.3419313 × 10^15
Variance: 8.1398716 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1493146175 8
 
< 0.1%
1478345904 6
 
< 0.1%
1484649802 6
 
< 0.1%
1481369525 6
 
< 0.1%
1495031359 5
 
< 0.1%
1500856602 5
 
< 0.1%
1494374199 5
 
< 0.1%
1490274633 4
 
< 0.1%
1494590623 4
 
< 0.1%
1490264449 4
 
< 0.1%
Other values (887149) 903600
> 99.9%
ValueCountFrequency (%)
1470034812 1
< 0.1%
1470035066 1
< 0.1%
1470035081 1
< 0.1%
1470035161 1
< 0.1%
1470035170 1
< 0.1%
1470035292 1
< 0.1%
1470035429 1
< 0.1%
1470035457 1
< 0.1%
1470035501 1
< 0.1%
1470035521 1
< 0.1%
ValueCountFrequency (%)
1501657193 1
< 0.1%
1501657190 1
< 0.1%
1501657186 1
< 0.1%
1501657166 1
< 0.1%
1501657161 1
< 0.1%
1501657013 1
< 0.1%
1501656998 1
< 0.1%
1501656981 1
< 0.1%
1501656976 1
< 0.1%
1501656970 1
< 0.1%

continent
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.7 MiB
Americas: 450377
Asia: 223698
Europe: 198311
Oceania: 15054
Africa: 14745

Length

Max length: 9
Median length: 8
Mean length: 6.5232274
Min length: 4

Characters and Unicode

Total characters: 5894734
Distinct characters: 19
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Asia
2nd row: Oceania
3rd row: Europe
4th row: Asia
5th row: Europe

Common Values

Value      Count   Frequency (%)
Americas   450377  49.8%
Asia       223698  24.8%
Europe     198311  21.9%
Oceania    15054   1.7%
Africa     14745   1.6%
(not set)  1468    0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
americas 450377
49.8%
asia 223698
24.7%
europe 198311
21.9%
oceania 15054
 
1.7%
africa 14745
 
1.6%
not 1468
 
0.2%
set 1468
 
0.2%

Most occurring characters

ValueCountFrequency (%)
a 718928
12.2%
i 703874
11.9%
A 688820
11.7%
s 675543
11.5%
e 665210
11.3%
r 663433
11.3%
c 480176
8.1%
m 450377
7.6%
o 199779
 
3.4%
p 198311
 
3.4%
Other values (9) 450283
7.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4988145
84.6%
Uppercase Letter 902185
 
15.3%
Open Punctuation 1468
 
< 0.1%
Space Separator 1468
 
< 0.1%
Close Punctuation 1468
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 718928
14.4%
i 703874
14.1%
s 675543
13.5%
e 665210
13.3%
r 663433
13.3%
c 480176
9.6%
m 450377
9.0%
o 199779
 
4.0%
p 198311
 
4.0%
u 198311
 
4.0%
Other values (3) 34203
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
A 688820
76.4%
E 198311
 
22.0%
O 15054
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 1468
100.0%
Space Separator
ValueCountFrequency (%)
1468
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1468
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5890330
99.9%
Common 4404
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 718928
12.2%
i 703874
11.9%
A 688820
11.7%
s 675543
11.5%
e 665210
11.3%
r 663433
11.3%
c 480176
8.2%
m 450377
7.6%
o 199779
 
3.4%
p 198311
 
3.4%
Other values (6) 445879
7.6%
Common
ValueCountFrequency (%)
( 1468
33.3%
1468
33.3%
) 1468
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5894734
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 718928
12.2%
i 703874
11.9%
A 688820
11.7%
s 675543
11.5%
e 665210
11.3%
r 663433
11.3%
c 480176
8.1%
m 450377
7.6%
o 199779
 
3.4%
p 198311
 
3.4%
Other values (9) 450283
7.6%

subContinent
Categorical

HIGH CORRELATION 

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 61.7 MiB
Northern America: 390657
Southeast Asia: 77800
Southern Asia: 59321
Western Europe: 59114
Northern Europe: 58168
Other values (18): 258593

Length

Max length: 18
Median length: 16
Mean length: 14.621631
Min length: 9

Characters and Unicode

Total characters: 13212881
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Western Asia
2nd row: Australasia
3rd row: Southern Europe
4th row: Southeast Asia
5th row: Northern Europe

Common Values

ValueCountFrequency (%)
Northern America 390657
43.2%
Southeast Asia 77800
 
8.6%
Southern Asia 59321
 
6.6%
Western Europe 59114
 
6.5%
Northern Europe 58168
 
6.4%
Eastern Asia 46919
 
5.2%
Eastern Europe 45249
 
5.0%
South America 41731
 
4.6%
Western Asia 38443
 
4.3%
Southern Europe 35780
 
4.0%
Other values (13) 50471
 
5.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
northern 456508
25.5%
america 447971
25.0%
asia 223698
12.5%
europe 198311
11.1%
western 100130
 
5.6%
southern 97270
 
5.4%
eastern 94095
 
5.3%
southeast 77800
 
4.3%
south 41731
 
2.3%
central 16798
 
0.9%
Other values (10) 35589
 
2.0%

Most occurring characters

ValueCountFrequency (%)
r 1899690
14.4%
e 1593577
12.1%
t 979961
 
7.4%
a 924840
 
7.0%
886248
 
6.7%
o 873223
 
6.6%
n 768946
 
5.8%
i 704377
 
5.3%
A 701307
 
5.3%
h 673309
 
5.1%
Other values (21) 3207403
24.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10536732
79.7%
Uppercase Letter 1786965
 
13.5%
Space Separator 886248
 
6.7%
Open Punctuation 1468
 
< 0.1%
Close Punctuation 1468
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1899690
18.0%
e 1593577
15.1%
t 979961
9.3%
a 924840
8.8%
o 873223
8.3%
n 768946
7.3%
i 704377
 
6.7%
h 673309
 
6.4%
s 527138
 
5.0%
c 462771
 
4.4%
Other values (9) 1128900
10.7%
Uppercase Letter
ValueCountFrequency (%)
A 701307
39.2%
N 456508
25.5%
E 292406
16.4%
S 216801
 
12.1%
W 100130
 
5.6%
C 19204
 
1.1%
M 529
 
< 0.1%
R 55
 
< 0.1%
P 25
 
< 0.1%
Space Separator
ValueCountFrequency (%)
886248
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1468
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1468
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12323697
93.3%
Common 889184
 
6.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1899690
15.4%
e 1593577
12.9%
t 979961
 
8.0%
a 924840
 
7.5%
o 873223
 
7.1%
n 768946
 
6.2%
i 704377
 
5.7%
A 701307
 
5.7%
h 673309
 
5.5%
s 527138
 
4.3%
Other values (18) 2677329
21.7%
Common
ValueCountFrequency (%)
886248
99.7%
( 1468
 
0.2%
) 1468
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13212881
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1899690
14.4%
e 1593577
12.1%
t 979961
 
7.4%
a 924840
 
7.0%
886248
 
6.7%
o 873223
 
6.6%
n 768946
 
5.8%
i 704377
 
5.3%
A 701307
 
5.3%
h 673309
 
5.1%
Other values (21) 3207403
24.3%
Distinct: 222
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 57.5 MiB

Length

Max length: 24
Median length: 22
Mean length: 9.7230386
Min length: 4

Characters and Unicode

Total characters: 8786253
Distinct characters: 63
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): < 0.1%

Sample

1st row: Turkey
2nd row: Australia
3rd row: Spain
4th row: Indonesia
5th row: United Kingdom
ValueCountFrequency (%)
united 405281
30.3%
states 364744
27.2%
india 51140
 
3.8%
kingdom 37393
 
2.8%
canada 25869
 
1.9%
vietnam 24598
 
1.8%
turkey 20522
 
1.5%
thailand 20123
 
1.5%
germany 19980
 
1.5%
brazil 19783
 
1.5%
Other values (252) 349718
26.1%

Most occurring characters

ValueCountFrequency (%)
t 1238439
14.1%
e 1021620
11.6%
a 965182
11.0%
n 792306
9.0%
i 766000
8.7%
d 601015
 
6.8%
s 464391
 
5.3%
435498
 
5.0%
U 411769
 
4.7%
S 410463
 
4.7%
Other values (53) 1679570
19.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7005686
79.7%
Uppercase Letter 1338417
 
15.2%
Space Separator 435498
 
5.0%
Open Punctuation 2510
 
< 0.1%
Close Punctuation 2510
 
< 0.1%
Other Punctuation 1195
 
< 0.1%
Final Punctuation 320
 
< 0.1%
Dash Punctuation 117
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1238439
17.7%
e 1021620
14.6%
a 965182
13.8%
n 792306
11.3%
i 766000
10.9%
d 601015
8.6%
s 464391
 
6.6%
r 198000
 
2.8%
l 151162
 
2.2%
o 146869
 
2.1%
Other values (21) 660702
9.4%
Uppercase Letter
ValueCountFrequency (%)
U 411769
30.8%
S 410463
30.7%
I 85110
 
6.4%
T 55494
 
4.1%
K 50288
 
3.8%
C 44495
 
3.3%
A 32529
 
2.4%
P 32433
 
2.4%
B 31573
 
2.4%
V 26788
 
2.0%
Other values (15) 157475
 
11.8%
Other Punctuation
ValueCountFrequency (%)
& 1106
92.6%
. 89
 
7.4%
Space Separator
ValueCountFrequency (%)
435498
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2510
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2510
100.0%
Final Punctuation
ValueCountFrequency (%)
320
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 117
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8344103
95.0%
Common 442150
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1238439
14.8%
e 1021620
12.2%
a 965182
11.6%
n 792306
9.5%
i 766000
9.2%
d 601015
7.2%
s 464391
 
5.6%
U 411769
 
4.9%
S 410463
 
4.9%
r 198000
 
2.4%
Other values (46) 1474918
17.7%
Common
ValueCountFrequency (%)
435498
98.5%
( 2510
 
0.6%
) 2510
 
0.6%
& 1106
 
0.3%
320
 
0.1%
- 117
 
< 0.1%
. 89
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8785433
> 99.9%
None 500
 
< 0.1%
Punctuation 320
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 1238439
14.1%
e 1021620
11.6%
a 965182
11.0%
n 792306
9.0%
i 766000
8.7%
d 601015
 
6.8%
s 464391
 
5.3%
435498
 
5.0%
U 411769
 
4.7%
S 410463
 
4.7%
Other values (46) 1678750
19.1%
None
ValueCountFrequency (%)
ô 320
64.0%
é 147
29.4%
ç 30
 
6.0%
ã 1
 
0.2%
í 1
 
0.2%
Å 1
 
0.2%
Punctuation
ValueCountFrequency (%)
320
100.0%

region
Text

Distinct: 376
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.9 MiB

Length

Max length: 33
Median length: 29
Mean length: 20.613868
Min length: 4

Characters and Unicode

Total characters: 18627784
Distinct characters: 56
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Izmir
2nd row: not available in demo dataset
3rd row: Community of Madrid
4th row: not available in demo dataset
5th row: not available in demo dataset
Value               Count   Frequency (%)
not                 536056  17.2%
available           508229  16.3%
in                  508229  16.3%
demo                508229  16.3%
dataset             508229  16.3%
california          107501  3.4%
new                 33063   1.1%
set                 27827   0.9%
york                26433   0.8%
of                  13607   0.4%
Other values (452)  342152  11.0%
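
The region column reports 0 missing values, yet most of its content appears to be placeholder text. The sketch below estimates the placeholder share from the word counts above, assuming that every "available" token comes from "not available in demo dataset" and every "set" token from "(not set)". `PLACEHOLDERS` and `clean` are illustrative names.

```python
# Token counts copied from the region word table above.
n_rows = 903_653
not_available = 508_229   # rows reading "not available in demo dataset"
not_tokens = 536_056      # occurrences of the token "not"
not_set = 27_827          # occurrences of the token "set", taken to mean "(not set)"

# "not" appears once in each placeholder, so the two placeholder counts should add up.
counts_agree = not_available + not_set == not_tokens

placeholder_share = 100 * (not_available + not_set) / n_rows
print(counts_agree, f"{placeholder_share:.1f}%")

PLACEHOLDERS = {"(not set)", "not available in demo dataset"}

def clean(value):
    """Map placeholder strings to None so they count as missing."""
    return None if value in PLACEHOLDERS else value

print([clean(v) for v in ["Izmir", "not available in demo dataset", "(not set)"]])
```

Mapping the placeholders to missing values before profiling would make the "Missing" figures for the geographic columns reflect this.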

Most occurring characters

ValueCountFrequency (%)
a 3037450
16.3%
2215902
11.9%
e 1709137
9.2%
t 1704294
9.1%
i 1434438
7.7%
o 1338634
7.2%
n 1327976
7.1%
l 1227746
6.6%
d 1063436
 
5.7%
s 629109
 
3.4%
Other values (46) 2939662
15.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15832688
85.0%
Space Separator 2215902
 
11.9%
Uppercase Letter 512694
 
2.8%
Close Punctuation 27827
 
0.1%
Open Punctuation 27827
 
0.1%
Dash Punctuation 10582
 
0.1%
Other Punctuation 264
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3037450
19.2%
e 1709137
10.8%
t 1704294
10.8%
i 1434438
9.1%
o 1338634
8.5%
n 1327976
8.4%
l 1227746
7.8%
d 1063436
 
6.7%
s 629109
 
4.0%
m 532538
 
3.4%
Other values (16) 1827930
11.5%
Uppercase Letter
ValueCountFrequency (%)
C 145934
28.5%
N 43828
 
8.5%
M 36782
 
7.2%
T 35969
 
7.0%
Y 26465
 
5.2%
D 21969
 
4.3%
S 21727
 
4.2%
I 19813
 
3.9%
B 19351
 
3.8%
H 17118
 
3.3%
Other values (15) 123738
24.1%
Space Separator
ValueCountFrequency (%)
2215902
100.0%
Close Punctuation
ValueCountFrequency (%)
) 27827
100.0%
Open Punctuation
ValueCountFrequency (%)
( 27827
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10582
100.0%
Other Punctuation
ValueCountFrequency (%)
' 264
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16345382
87.7%
Common 2282402
 
12.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3037450
18.6%
e 1709137
10.5%
t 1704294
10.4%
i 1434438
8.8%
o 1338634
8.2%
n 1327976
8.1%
l 1227746
7.5%
d 1063436
 
6.5%
s 629109
 
3.8%
m 532538
 
3.3%
Other values (41) 2340624
14.3%
Common
ValueCountFrequency (%)
2215902
97.1%
) 27827
 
1.2%
( 27827
 
1.2%
- 10582
 
0.5%
' 264
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18627784
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3037450
16.3%
2215902
11.9%
e 1709137
9.2%
t 1704294
9.1%
i 1434438
7.7%
o 1338634
7.2%
n 1327976
7.1%
l 1227746
6.6%
d 1063436
 
5.7%
s 629109
 
3.4%
Other values (46) 2939662
15.8%

metro
Text

Distinct: 94
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 69.1 MiB

Length

Max length: 41
Median length: 29
Mean length: 23.187242
Min length: 6

Characters and Unicode

Total characters: 20953221
Distinct characters: 57
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (not set)
2nd row: not available in demo dataset
3rd row: (not set)
4th row: not available in demo dataset
5th row: not available in demo dataset
ValueCountFrequency (%)
not 709995
19.9%
in 508284
14.3%
available 508229
14.3%
demo 508229
14.3%
dataset 508229
14.3%
set 201766
 
5.7%
ca 107495
 
3.0%
san 97670
 
2.7%
francisco-oakland-san 95913
 
2.7%
jose 95913
 
2.7%
Other values (161) 221970
 
6.2%

Most occurring characters

ValueCountFrequency (%)
a 3084714
14.7%
2660040
12.7%
t 1983594
9.5%
e 1908676
9.1%
n 1674938
8.0%
o 1522326
7.3%
l 1144979
 
5.5%
i 1137893
 
5.4%
d 1129116
 
5.4%
s 947069
 
4.5%
Other values (47) 3759876
17.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16654529
79.5%
Space Separator 2660040
 
12.7%
Uppercase Letter 1015958
 
4.8%
Dash Punctuation 208990
 
1.0%
Open Punctuation 205397
 
1.0%
Close Punctuation 205397
 
1.0%
Other Punctuation 2593
 
< 0.1%
Connector Punctuation 317
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 201944
19.9%
A 142147
14.0%
C 122747
12.1%
O 99783
9.8%
F 98863
9.7%
J 96257
9.5%
N 58712
 
5.8%
Y 54047
 
5.3%
L 33820
 
3.3%
T 17187
 
1.7%
Other values (15) 90451
8.9%
Lowercase Letter
ValueCountFrequency (%)
a 3084714
18.5%
t 1983594
11.9%
e 1908676
11.5%
n 1674938
10.1%
o 1522326
9.1%
l 1144979
 
6.9%
i 1137893
 
6.8%
d 1129116
 
6.8%
s 947069
 
5.7%
m 517042
 
3.1%
Other values (14) 1604182
9.6%
Other Punctuation
ValueCountFrequency (%)
. 2520
97.2%
& 61
 
2.4%
, 12
 
0.5%
Space Separator
ValueCountFrequency (%)
2660040
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 208990
100.0%
Open Punctuation
ValueCountFrequency (%)
( 205397
100.0%
Close Punctuation
ValueCountFrequency (%)
) 205397
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 317
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17670487
84.3%
Common 3282734
 
15.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3084714
17.5%
t 1983594
11.2%
e 1908676
10.8%
n 1674938
9.5%
o 1522326
8.6%
l 1144979
 
6.5%
i 1137893
 
6.4%
d 1129116
 
6.4%
s 947069
 
5.4%
m 517042
 
2.9%
Other values (39) 2620140
14.8%
Common
ValueCountFrequency (%)
2660040
81.0%
- 208990
 
6.4%
( 205397
 
6.3%
) 205397
 
6.3%
. 2520
 
0.1%
_ 317
 
< 0.1%
& 61
 
< 0.1%
, 12
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20953221
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3084714
14.7%
2660040
12.7%
t 1983594
9.5%
e 1908676
9.1%
n 1674938
8.0%
o 1522326
7.3%
l 1144979
 
5.5%
i 1137893
 
5.4%
d 1129116
 
5.4%
s 947069
 
4.5%
Other values (47) 3759876
17.9%

city
Text

Distinct: 649
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.5 MiB

Length

Max length: 33
Median length: 29
Mean length: 20.198846
Min length: 3

Characters and Unicode

Total characters: 18252748
Distinct characters: 58
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Izmir
2nd row: not available in demo dataset
3rd row: Madrid
4th row: not available in demo dataset
5th row: not available in demo dataset
ValueCountFrequency (%)
not 542491
17.2%
available 508229
16.1%
in 508229
16.1%
demo 508229
16.1%
dataset 508229
16.1%
mountain 40884
 
1.3%
view 40884
 
1.3%
san 34510
 
1.1%
set 34262
 
1.1%
new 29593
 
0.9%
Other values (729) 392267
12.5%

Most occurring characters

ValueCountFrequency (%)
a 2851125
15.6%
2244154
12.3%
e 1772459
9.7%
t 1723610
9.4%
n 1388588
7.6%
o 1327491
7.3%
i 1247087
6.8%
l 1121717
 
6.1%
d 1065724
 
5.8%
s 638731
 
3.5%
Other values (48) 2872062
15.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15395189
84.3%
Space Separator 2244154
 
12.3%
Uppercase Letter 540915
 
3.0%
Close Punctuation 34262
 
0.2%
Open Punctuation 34262
 
0.2%
Dash Punctuation 3850
 
< 0.1%
Other Punctuation 116
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2851125
18.5%
e 1772459
11.5%
t 1723610
11.2%
n 1388588
9.0%
o 1327491
8.6%
i 1247087
8.1%
l 1121717
 
7.3%
d 1065724
 
6.9%
s 638731
 
4.1%
b 545505
 
3.5%
Other values (17) 1713152
11.1%
Uppercase Letter
ValueCountFrequency (%)
S 77212
14.3%
M 73829
13.6%
V 44431
 
8.2%
C 42122
 
7.8%
A 34771
 
6.4%
N 32069
 
5.9%
Y 30238
 
5.6%
B 29697
 
5.5%
L 27718
 
5.1%
H 25297
 
4.7%
Other values (15) 123531
22.8%
Other Punctuation
ValueCountFrequency (%)
' 92
79.3%
. 24
 
20.7%
Space Separator
ValueCountFrequency (%)
2244154
100.0%
Close Punctuation
ValueCountFrequency (%)
) 34262
100.0%
Open Punctuation
ValueCountFrequency (%)
( 34262
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3850
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15936104
87.3%
Common 2316644
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2851125
17.9%
e 1772459
11.1%
t 1723610
10.8%
n 1388588
8.7%
o 1327491
8.3%
i 1247087
7.8%
l 1121717
 
7.0%
d 1065724
 
6.7%
s 638731
 
4.0%
b 545505
 
3.4%
Other values (42) 2254067
14.1%
Common
ValueCountFrequency (%)
2244154
96.9%
) 34262
 
1.5%
( 34262
 
1.5%
- 3850
 
0.2%
' 92
 
< 0.1%
. 24
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18252664
> 99.9%
None 84
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2851125
15.6%
2244154
12.3%
e 1772459
9.7%
t 1723610
9.4%
n 1388588
7.6%
o 1327491
7.3%
i 1247087
6.8%
l 1121717
 
6.1%
d 1065724
 
5.8%
s 638731
 
3.5%
Other values (47) 2871978
15.7%
None
ValueCountFrequency (%)
ã 84
100.0%
Distinct: 28064
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.9 MiB

Length

Max length: 37
Median length: 36
Mean length: 11.307548
Min length: 2

Characters and Unicode

Total characters: 10218100
Distinct characters: 44
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14649
Unique (%): 1.6%

Sample

1st row: ttnet.com.tr
2nd row: dodo.net.au
3rd row: unknown.unknown
4th row: unknown.unknown
5th row: unknown.unknown
Value                   Count     Frequency (%)
not                     244881    21.3%
set                     244881    21.3%
unknown.unknown         146034    12.7%
comcast.net             28743     2.5%
rr.com                  14827     1.3%
verizon.net             13637     1.2%
ttnet.com.tr            13228     1.2%
comcastbusiness.net     9985      0.9%
hinet.net               7919      0.7%
virginm.net             6414      0.6%
Other values (28055)    417985    36.4%

Most occurring characters

ValueCountFrequency (%)
n 1652112
16.2%
t 1109389
 
10.9%
o 976560
 
9.6%
e 853294
 
8.4%
. 764746
 
7.5%
s 481477
 
4.7%
u 410558
 
4.0%
c 386518
 
3.8%
w 332859
 
3.3%
k 332681
 
3.3%
Other values (34) 2917906
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8642089
84.6%
Other Punctuation 764767
 
7.5%
Open Punctuation 244881
 
2.4%
Close Punctuation 244881
 
2.4%
Space Separator 244881
 
2.4%
Decimal Number 39097
 
0.4%
Dash Punctuation 37500
 
0.4%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1652112
19.1%
t 1109389
12.8%
o 976560
11.3%
e 853294
9.9%
s 481477
 
5.6%
u 410558
 
4.8%
c 386518
 
4.5%
w 332859
 
3.9%
k 332681
 
3.8%
a 310767
 
3.6%
Other values (16) 1795874
20.8%
Decimal Number
ValueCountFrequency (%)
3 9720
24.9%
1 6911
17.7%
2 6125
15.7%
5 4066
10.4%
0 3520
 
9.0%
8 3008
 
7.7%
4 2346
 
6.0%
9 1865
 
4.8%
6 857
 
2.2%
7 679
 
1.7%
Other Punctuation
ValueCountFrequency (%)
. 764746
> 99.9%
\ 13
 
< 0.1%
@ 8
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 244881
100.0%
Close Punctuation
ValueCountFrequency (%)
) 244881
100.0%
Space Separator
ValueCountFrequency (%)
244881
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 37500
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8642089
84.6%
Common 1576011
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1652112
19.1%
t 1109389
12.8%
o 976560
11.3%
e 853294
9.9%
s 481477
 
5.6%
u 410558
 
4.8%
c 386518
 
4.5%
w 332859
 
3.9%
k 332681
 
3.8%
a 310767
 
3.6%
Other values (16) 1795874
20.8%
Common
ValueCountFrequency (%)
. 764746
48.5%
( 244881
 
15.5%
) 244881
 
15.5%
244881
 
15.5%
- 37500
 
2.4%
3 9720
 
0.6%
1 6911
 
0.4%
2 6125
 
0.4%
5 4066
 
0.3%
0 3520
 
0.2%
Other values (8) 8780
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10218100
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1652112
16.2%
t 1109389
 
10.9%
o 976560
 
9.6%
e 853294
 
8.4%
. 764746
 
7.5%
s 481477
 
4.7%
u 410558
 
4.0%
c 386518
 
3.8%
w 332859
 
3.3%
k 332681
 
3.3%
Other values (34) 2917906
28.6%
Distinct: 54
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.7 MiB

Length

Max length: 43
Median length: 6
Mean length: 6.460627
Min length: 1

Characters and Unicode

Total characters: 5838165
Distinct characters: 68
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): < 0.1%

Sample

1st row: Chrome
2nd row: Firefox
3rd row: Chrome
4th row: UC Browser
5th row: Chrome
Value                Count     Frequency (%)
chrome               620365    65.4%
safari               189095    19.9%
firefox              37069     3.9%
internet             19375     2.0%
explorer             19375     2.0%
opera                11782     1.2%
edge                 10205     1.1%
android              8420      0.9%
webview              7865      0.8%
in-app               6850      0.7%
Other values (60)    18727     2.0%

Most occurring characters

ValueCountFrequency (%)
r 935843
16.0%
e 759760
13.0%
o 693831
11.9%
C 624899
10.7%
m 621325
10.6%
h 620632
10.6%
a 400787
6.9%
i 263237
 
4.5%
f 226356
 
3.9%
S 189714
 
3.2%
Other values (58) 501781
8.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4822831
82.6%
Uppercase Letter 948991
 
16.3%
Space Separator 45475
 
0.8%
Dash Punctuation 6866
 
0.1%
Open Punctuation 6859
 
0.1%
Close Punctuation 6859
 
0.1%
Decimal Number 201
 
< 0.1%
Connector Punctuation 82
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 935843
19.4%
e 759760
15.8%
o 693831
14.4%
m 621325
12.9%
h 620632
12.9%
a 400787
8.3%
i 263237
 
5.5%
f 226356
 
4.7%
n 61814
 
1.3%
x 56690
 
1.2%
Other values (16) 182556
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
C 624899
65.8%
S 189714
 
20.0%
F 37094
 
3.9%
E 29851
 
3.1%
I 19410
 
2.0%
O 12046
 
1.3%
A 9371
 
1.0%
W 7865
 
0.8%
M 7320
 
0.8%
B 5654
 
0.6%
Other values (14) 5767
 
0.6%
Decimal Number
ValueCountFrequency (%)
2 63
31.3%
0 59
29.4%
1 29
14.4%
4 24
 
11.9%
7 10
 
5.0%
5 6
 
3.0%
3 4
 
2.0%
9 4
 
2.0%
8 1
 
0.5%
6 1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 6858
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 6858
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
45475
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6866
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 82
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5771822
98.9%
Common 66343
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 935843
16.2%
e 759760
13.2%
o 693831
12.0%
C 624899
10.8%
m 621325
10.8%
h 620632
10.8%
a 400787
6.9%
i 263237
 
4.6%
f 226356
 
3.9%
S 189714
 
3.3%
Other values (40) 435438
7.5%
Common
ValueCountFrequency (%)
45475
68.5%
- 6866
 
10.3%
( 6858
 
10.3%
) 6858
 
10.3%
_ 82
 
0.1%
2 63
 
0.1%
0 59
 
0.1%
1 29
 
< 0.1%
4 24
 
< 0.1%
7 10
 
< 0.1%
Other values (8) 19
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5838165
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 935843
16.0%
e 759760
13.0%
o 693831
11.9%
C 624899
10.7%
m 621325
10.6%
h 620632
10.6%
a 400787
6.9%
i 263237
 
4.5%
f 226356
 
3.9%
S 189714
 
3.2%
Other values (58) 501781
8.6%

operatingSystem
Categorical

HIGH CORRELATION 

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.2 MiB
Windows: 350072
Macintosh: 253938
Android: 123892
iOS: 107665
Linux: 35034
Other values (15): 33052

Length

Max length: 13
Median length: 7
Mean length: 7.0862532
Min length: 3

Characters and Unicode

Total characters: 6403514
Distinct characters: 41
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Windows
2nd row: Macintosh
3rd row: Windows
4th row: Linux
5th row: Android

Common Values

Value                Count     Frequency (%)
Windows              350072    38.7%
Macintosh            253938    28.1%
Android              123892    13.7%
iOS                  107665    11.9%
Linux                35034     3.9%
Chrome OS            26337     2.9%
(not set)            4695      0.5%
Windows Phone        1216      0.1%
Samsung              280       < 0.1%
BlackBerry           218       < 0.1%
Other values (10)    306       < 0.1%

Length

Histogram of lengths of the category
Value                Count     Frequency (%)
windows              351288    37.5%
macintosh            253938    27.1%
android              123892    13.2%
ios                  107665    11.5%
linux                35034     3.7%
os                   26426     2.8%
chrome               26337     2.8%
not                  4695      0.5%
set                  4695      0.5%
phone                1216      0.1%
Other values (14)    941       0.1%
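
The word counts above differ from the Common Values counts because each value is split into words first: "windows" (351288) is the sum of "Windows" (350072) and "Windows Phone" (1216). The sketch below reproduces that sum from the Common Values figures. The tokenisation rule (lowercase, split on whitespace, strip surrounding punctuation) is an assumption inferred from the numbers, not the report's documented implementation, and `word_counts` is a hypothetical helper.

```python
from collections import Counter

# Counts copied from the operatingSystem "Common Values" table (top 10 rows only;
# the "Other values (10)" bucket is not itemised in the report, so it is omitted).
common_values = {
    "Windows": 350072,
    "Macintosh": 253938,
    "Android": 123892,
    "iOS": 107665,
    "Linux": 35034,
    "Chrome OS": 26337,
    "(not set)": 4695,
    "Windows Phone": 1216,
    "Samsung": 280,
    "BlackBerry": 218,
}


def word_counts(value_counts):
    """Aggregate per-value counts into per-word counts (assumed tokenisation rule)."""
    counts = Counter()
    for value, n in value_counts.items():
        for token in value.lower().split():
            token = token.strip("()[]{}.,:;!?'\"")
            if token:
                counts[token] += n
    return counts


words = word_counts(common_values)
print(words["windows"], words["not"], words["phone"])
```

Because the omitted "Other values" bucket also contains words, a token such as "os" comes out slightly lower here than in the report.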

Most occurring characters

ValueCountFrequency (%)
i 872314
13.6%
n 770618
12.0%
o 761662
11.9%
s 610201
9.5%
d 599208
9.4%
W 351423
 
5.5%
w 351288
 
5.5%
h 281491
 
4.4%
t 263464
 
4.1%
a 254438
 
4.0%
Other values (31) 1287407
20.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5300528
82.8%
Uppercase Letter 1061121
 
16.6%
Space Separator 32474
 
0.5%
Open Punctuation 4695
 
0.1%
Close Punctuation 4695
 
0.1%
Decimal Number 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 872314
16.5%
n 770618
14.5%
o 761662
14.4%
s 610201
11.5%
d 599208
11.3%
w 351288
6.6%
h 281491
 
5.3%
t 263464
 
5.0%
a 254438
 
4.8%
c 254156
 
4.8%
Other values (12) 281688
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
W 351423
33.1%
M 253939
23.9%
S 134385
 
12.7%
O 134094
 
12.6%
A 123892
 
11.7%
L 35034
 
3.3%
C 26338
 
2.5%
P 1216
 
0.1%
B 447
 
< 0.1%
N 139
 
< 0.1%
Other values (5) 214
 
< 0.1%
Space Separator
ValueCountFrequency (%)
32474
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4695
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4695
100.0%
Decimal Number
ValueCountFrequency (%)
3 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6361649
99.3%
Common 41865
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 872314
13.7%
n 770618
12.1%
o 761662
12.0%
s 610201
9.6%
d 599208
9.4%
W 351423
 
5.5%
w 351288
 
5.5%
h 281491
 
4.4%
t 263464
 
4.1%
a 254438
 
4.0%
Other values (27) 1245542
19.6%
Common
ValueCountFrequency (%)
32474
77.6%
( 4695
 
11.2%
) 4695
 
11.2%
3 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6403514
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 872314
13.6%
n 770618
12.0%
o 761662
11.9%
s 610201
9.5%
d 599208
9.4%
W 351423
 
5.5%
w 351288
 
5.5%
h 281491
 
4.4%
t 263464
 
4.1%
a 254438
 
4.0%
Other values (31) 1287407
20.1%

isMobile
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 882.6 KiB

Value    Count     Frequency (%)
False    664530    73.5%
True     239123    26.5%

deviceCategory
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 MiB
desktop: 664479
mobile: 208725
tablet: 30449

Length

Max length: 7
Median length: 7
Mean length: 6.7353254
Min length: 6

Characters and Unicode

Total characters: 6086397
Distinct characters: 12
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: desktop
2nd row: desktop
3rd row: desktop
4th row: desktop
5th row: mobile

Common Values

Value      Count     Frequency (%)
desktop    664479    73.5%
mobile     208725    23.1%
tablet     30449     3.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count     Frequency (%)
desktop    664479    73.5%
mobile     208725    23.1%
tablet     30449     3.4%

Most occurring characters

ValueCountFrequency (%)
e 903653
14.8%
o 873204
14.3%
t 725377
11.9%
d 664479
10.9%
s 664479
10.9%
k 664479
10.9%
p 664479
10.9%
b 239174
 
3.9%
l 239174
 
3.9%
m 208725
 
3.4%
Other values (2) 239174
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6086397
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 903653
14.8%
o 873204
14.3%
t 725377
11.9%
d 664479
10.9%
s 664479
10.9%
k 664479
10.9%
p 664479
10.9%
b 239174
 
3.9%
l 239174
 
3.9%
m 208725
 
3.4%
Other values (2) 239174
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 6086397
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 903653
14.8%
o 873204
14.3%
t 725377
11.9%
d 664479
10.9%
s 664479
10.9%
k 664479
10.9%
p 664479
10.9%
b 239174
 
3.9%
l 239174
 
3.9%
m 208725
 
3.4%
Other values (2) 239174
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6086397
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 903653
14.8%
o 873204
14.3%
t 725377
11.9%
d 664479
10.9%
s 664479
10.9%
k 664479
10.9%
p 664479
10.9%
b 239174
 
3.9%
l 239174
 
3.9%
m 208725
 
3.4%
Other values (2) 239174
 
3.9%

hits
Real number (ℝ)

HIGH CORRELATION 

Distinct: 274
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5965376
Minimum: 1
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 18
Maximum: 500
Range: 499
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 9.6414371
Coefficient of variation (CV): 2.0975434
Kurtosis: 230.35198
Mean: 4.5965376
Median Absolute Deviation (MAD): 1
Skewness: 9.7804556
Sum: 4153675
Variance: 92.957308
Monotonicity: Not monotonic
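
Several of the hits statistics above follow from one another. The sketch below recomputes the derived ones from the reported figures, using the standard definitions: CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum, variance = standard deviation squared, and mean = sum / non-missing count. The variable names are illustrative, and the report's own implementation is not shown here.

```python
# Values copied from the hits summary above.
n_observations = 903653      # from the dataset overview; hits has no missing values
total = 4153675
mean = 4.5965376
std = 9.6414371
variance = 92.957308
q1, q3 = 1, 4
minimum, maximum = 1, 500

cv = std / mean              # coefficient of variation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum
mean_from_sum = total / n_observations

print(f"CV={cv:.7f} IQR={iqr} range={value_range} mean={mean_from_sum:.7f}")
```

A CV above 2 together with a median of 2 and a maximum of 500 is consistent with the strong right skew reported for this variable.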
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
1                     446754    49.4%
2                     137952    15.3%
3                     70402     7.8%
4                     42444     4.7%
5                     30939     3.4%
6                     23918     2.6%
7                     19518     2.2%
8                     15484     1.7%
9                     12959     1.4%
10                    10640     1.2%
Other values (264)    92643     10.3%

Value    Count     Frequency (%)
1        446754    49.4%
2        137952    15.3%
3        70402     7.8%
4        42444     4.7%
5        30939     3.4%
6        23918     2.6%
7        19518     2.2%
8        15484     1.7%
9        12959     1.4%
10       10640     1.2%

Value    Count    Frequency (%)
500      10       < 0.1%
489      1        < 0.1%
483      1        < 0.1%
471      1        < 0.1%
445      1        < 0.1%
437      1        < 0.1%
406      1        < 0.1%
387      1        < 0.1%
386      1        < 0.1%
385      2        < 0.1%

pageviews
Real number (ℝ)

HIGH CORRELATION 

Distinct: 213
Distinct (%): < 0.1%
Missing: 100
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8497642
Minimum: 1
Maximum: 469
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 4
95-th percentile: 15
Maximum: 469
Range: 468
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 7.025274
Coefficient of variation (CV): 1.8248582
Kurtosis: 237.41488
Mean: 3.8497642
Median Absolute Deviation (MAD): 0
Skewness: 9.2150555
Sum: 3478466
Variance: 49.354474
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count     Frequency (%)
1                     452522    50.1%
2                     143770    15.9%
3                     73835     8.2%
4                     45192     5.0%
5                     33411     3.7%
6                     24688     2.7%
7                     19476     2.2%
8                     15272     1.7%
9                     12585     1.4%
10                    10104     1.1%
Other values (203)    72698     8.0%

Value    Count     Frequency (%)
1        452522    50.1%
2        143770    15.9%
3        73835     8.2%
4        45192     5.0%
5        33411     3.7%
6        24688     2.7%
7        19476     2.2%
8        15272     1.7%
9        12585     1.4%
10       10104     1.1%

Value    Count    Frequency (%)
469      1        < 0.1%
466      1        < 0.1%
431      1        < 0.1%
429      1        < 0.1%
400      1        < 0.1%
358      1        < 0.1%
351      1        < 0.1%
343      1        < 0.1%
341      2        < 0.1%
340      1        < 0.1%

transactionRevenue
Real number (ℝ)

MISSING  SKEWED 

Distinct: 5332
Distinct (%): 46.3%
Missing: 892138
Missing (%): 98.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3374479 × 10^8
Minimum: 10000
Maximum: 2.31295 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB
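
The percentages above use different denominators. Missing (%) is relative to all 903653 observations, while the reported Distinct (%) of 46.3% only matches if the denominator is the count of non-missing values. That reading is inferred from the figures, not taken from the tool's documentation. A minimal sketch that recomputes both, plus the mean from the reported sum:

```python
# Values copied from the dataset overview and the transactionRevenue summary.
n_observations = 903653
n_missing = 892138
n_distinct = 5332
reported_sum = 1.5400712e12
reported_mean = 1.3374479e8

n_present = n_observations - n_missing            # rows with a revenue value

missing_pct = 100 * n_missing / n_observations    # share of all rows
distinct_pct = 100 * n_distinct / n_present       # share of non-missing rows (assumed)
mean_from_sum = reported_sum / n_present

print(n_present, round(missing_pct, 1), round(distinct_pct, 1), f"{mean_from_sum:.4e}")
```

With only 11515 non-missing rows, every other statistic for this variable (quantiles, skewness, extreme values) describes that small subset rather than the full dataset.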

Quantile statistics

Minimum: 10000
5-th percentile: 10630000
Q1: 24930000
median: 49450000
Q3: 1.07655 × 10^8
95-th percentile: 4.92576 × 10^8
Maximum: 2.31295 × 10^10
Range: 2.312949 × 10^10
Interquartile range (IQR): 82725000

Descriptive statistics

Standard deviation: 4.4828523 × 10^8
Coefficient of variation (CV): 3.3517959
Kurtosis: 1020.3068
Mean: 1.3374479 × 10^8
Median Absolute Deviation (MAD): 30470000
Skewness: 25.722703
Sum: 1.5400712 × 10^12
Variance: 2.0095965 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count     Frequency (%)
16990000               256       < 0.1%
18990000               189       < 0.1%
33590000               187       < 0.1%
44790000               170       < 0.1%
13590000               135       < 0.1%
55990000               122       < 0.1%
19990000               116       < 0.1%
15990000               98        < 0.1%
15190000               93        < 0.1%
19190000               92        < 0.1%
Other values (5322)    10057     1.1%
(Missing)              892138    98.7%

Value      Count    Frequency (%)
10000      1        < 0.1%
40000      1        < 0.1%
90000      1        < 0.1%
160000     1        < 0.1%
200000     1        < 0.1%
490000     1        < 0.1%
770000     1        < 0.1%
790000     1        < 0.1%
990000     1        < 0.1%
1200000    7        < 0.1%

Value               Count    Frequency (%)
2.31295 × 10^10     1        < 0.1%
1.78555 × 10^10     1        < 0.1%
1.602375 × 10^10    1        < 0.1%
1.058914 × 10^10    1        < 0.1%
8677830000          1        < 0.1%
8248800000          1        < 0.1%
6996500000          1        < 0.1%
6826960000          1        < 0.1%
6248750000          1        < 0.1%
5614440000          1        < 0.1%

campaign
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 57.4 MiB
(not set): 865347
Data Share Promo: 16403
AW - Dynamic Search Ads Whole Site: 14244
AW - Accessories: 7070
test-liyuhz: 392
Other values (5): 197

Length

Max length: 47
Median length: 9
Mean length: 9.5797779
Min length: 9

Characters and Unicode

Total characters: 8656795
Distinct characters: 34
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: (not set)
2nd row: (not set)
3rd row: (not set)
4th row: (not set)
5th row: (not set)

Common Values

Value                                              Count     Frequency (%)
(not set)                                          865347    95.8%
Data Share Promo                                   16403     1.8%
AW - Dynamic Search Ads Whole Site                 14244     1.6%
AW - Accessories                                   7070      0.8%
test-liyuhz                                        392       < 0.1%
AW - Electronics                                   96        < 0.1%
Retail (DO NOT EDIT owners nophakun and tianyu)    50        < 0.1%
AW - Apparel                                       46        < 0.1%
All Products                                       4         < 0.1%
Data Share                                         1         < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
not 865397
45.5%
set 865347
45.5%
aw 21456
 
1.1%
21456
 
1.1%
data 16404
 
0.9%
share 16404
 
0.9%
promo 16403
 
0.9%
dynamic 14244
 
0.7%
search 14244
 
0.7%
ads 14244
 
0.7%
Other values (15) 36450
 
1.9%

Most occurring characters

ValueCountFrequency (%)
t 1762326
20.4%
998396
11.5%
e 939257
10.8%
o 919667
10.6%
s 901343
10.4%
n 879937
10.2%
) 865397
10.0%
( 865397
10.0%
a 77946
 
0.9%
r 54317
 
0.6%
Other values (24) 392812
 
4.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5734694
66.2%
Space Separator 998396
 
11.5%
Close Punctuation 865397
 
10.0%
Open Punctuation 865397
 
10.0%
Uppercase Letter 171063
 
2.0%
Dash Punctuation 21848
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1762326
30.7%
e 939257
16.4%
o 919667
16.0%
s 901343
15.7%
n 879937
15.3%
a 77946
 
1.4%
r 54317
 
0.9%
h 45334
 
0.8%
c 42824
 
0.7%
i 36146
 
0.6%
Other values (9) 75597
 
1.3%
Uppercase Letter
ValueCountFrequency (%)
S 44892
26.2%
A 42820
25.0%
W 35700
20.9%
D 30748
18.0%
P 16407
 
9.6%
E 146
 
0.1%
O 100
 
0.1%
T 100
 
0.1%
R 50
 
< 0.1%
N 50
 
< 0.1%
Space Separator
ValueCountFrequency (%)
998396
100.0%
Close Punctuation
ValueCountFrequency (%)
) 865397
100.0%
Open Punctuation
ValueCountFrequency (%)
( 865397
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21848
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5905757
68.2%
Common 2751038
31.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 1762326
29.8%
e 939257
15.9%
o 919667
15.6%
s 901343
15.3%
n 879937
14.9%
a 77946
 
1.3%
r 54317
 
0.9%
h 45334
 
0.8%
S 44892
 
0.8%
c 42824
 
0.7%
Other values (20) 237914
 
4.0%
Common
ValueCountFrequency (%)
998396
36.3%
) 865397
31.5%
( 865397
31.5%
- 21848
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8656795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 1762326
20.4%
998396
11.5%
e 939257
10.8%
o 919667
10.6%
s 901343
10.4%
n 879937
10.2%
) 865397
10.0%
( 865397
10.0%
a 77946
 
0.9%
r 54317
 
0.6%
Other values (24) 392812
 
4.5%

source
Text

Distinct: 380
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 56.9 MiB

Length

Max length: 60
Median length: 49
Mean length: 9.0206938
Min length: 3

Characters and Unicode

Total characters: 8151577
Distinct characters: 43
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): < 0.1%

Sample

1st row: google
2nd row: google
3rd row: google
4th row: google
5th row: google
Value                   Count     Frequency (%)
google                  400788    44.3%
youtube.com             212602    23.5%
direct                  143028    15.8%
mall.googleplex.com     66416     7.3%
partners                16411     1.8%
analytics.google.com    16172     1.8%
dfa                     5686      0.6%
google.com              4669      0.5%
m.facebook.com          3365      0.4%
baidu                   3356      0.4%
Other values (371)      31229     3.5%

Most occurring characters

ValueCountFrequency (%)
o 1571376
19.3%
g 1006376
12.3%
e 958629
11.8%
l 730101
9.0%
c 500801
 
6.1%
u 437502
 
5.4%
. 434839
 
5.3%
t 402952
 
4.9%
m 401095
 
4.9%
y 233355
 
2.9%
Other values (33) 1474551
18.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7412657
90.9%
Other Punctuation 434933
 
5.3%
Open Punctuation 143097
 
1.8%
Close Punctuation 143097
 
1.8%
Uppercase Letter 16411
 
0.2%
Decimal Number 971
 
< 0.1%
Dash Punctuation 342
 
< 0.1%
Space Separator 69
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1571376
21.2%
g 1006376
13.6%
e 958629
12.9%
l 730101
9.8%
c 500801
 
6.8%
u 437502
 
5.9%
t 402952
 
5.4%
m 401095
 
5.4%
y 233355
 
3.1%
b 228412
 
3.1%
Other values (16) 942058
12.7%
Decimal Number
ValueCountFrequency (%)
0 305
31.4%
2 273
28.1%
8 132
13.6%
9 97
 
10.0%
3 39
 
4.0%
1 38
 
3.9%
5 31
 
3.2%
6 24
 
2.5%
4 20
 
2.1%
7 12
 
1.2%
Other Punctuation
ValueCountFrequency (%)
. 434839
> 99.9%
: 94
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 143097
100.0%
Close Punctuation
ValueCountFrequency (%)
) 143097
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 16411
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 342
100.0%
Space Separator
ValueCountFrequency (%)
69
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7429068
91.1%
Common 722509
 
8.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1571376
21.2%
g 1006376
13.5%
e 958629
12.9%
l 730101
9.8%
c 500801
 
6.7%
u 437502
 
5.9%
t 402952
 
5.4%
m 401095
 
5.4%
y 233355
 
3.1%
b 228412
 
3.1%
Other values (17) 958469
12.9%
Common
ValueCountFrequency (%)
. 434839
60.2%
( 143097
 
19.8%
) 143097
 
19.8%
- 342
 
< 0.1%
0 305
 
< 0.1%
2 273
 
< 0.1%
8 132
 
< 0.1%
9 97
 
< 0.1%
: 94
 
< 0.1%
69
 
< 0.1%
Other values (6) 164
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8151577
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 1571376
19.3%
g 1006376
12.3%
e 958629
11.8%
l 730101
9.0%
c 500801
 
6.1%
u 437502
 
5.4%
. 434839
 
5.3%
t 402952
 
4.9%
m 401095
 
4.9%
y 233355
 
2.9%
Other values (33) 1474551
18.1%

medium
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.2 MiB
organic: 381561
referral: 330955
(none): 143026
cpc: 25326
affiliate: 16403
Other values (2): 6382

Length

Max length: 9
Median length: 8
Mean length: 7.1047117
Min length: 3

Characters and Unicode

Total characters: 6420194
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: organic
2nd row: organic
3rd row: organic
4th row: organic
5th row: organic

Common Values

Value        Count     Frequency (%)
organic      381561    42.2%
referral     330955    36.6%
(none)       143026    15.8%
cpc          25326     2.8%
affiliate    16403     1.8%
cpm          6262      0.7%
(not set)    120       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count     Frequency (%)
organic      381561    42.2%
referral     330955    36.6%
none         143026    15.8%
cpc          25326     2.8%
affiliate    16403     1.8%
cpm          6262      0.7%
not          120       < 0.1%
set          120       < 0.1%

Most occurring characters

ValueCountFrequency (%)
r 1374426
21.4%
e 821459
12.8%
a 745322
11.6%
n 667733
10.4%
o 524707
 
8.2%
c 438475
 
6.8%
i 414367
 
6.5%
g 381561
 
5.9%
f 363761
 
5.7%
l 347358
 
5.4%
Other values (7) 341025
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6133782
95.5%
Open Punctuation 143146
 
2.2%
Close Punctuation 143146
 
2.2%
Space Separator 120
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1374426
22.4%
e 821459
13.4%
a 745322
12.2%
n 667733
10.9%
o 524707
 
8.6%
c 438475
 
7.1%
i 414367
 
6.8%
g 381561
 
6.2%
f 363761
 
5.9%
l 347358
 
5.7%
Other values (4) 54613
 
0.9%
Open Punctuation
ValueCountFrequency (%)
( 143146
100.0%
Close Punctuation
ValueCountFrequency (%)
) 143146
100.0%
Space Separator
ValueCountFrequency (%)
120
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6133782
95.5%
Common 286412
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1374426
22.4%
e 821459
13.4%
a 745322
12.2%
n 667733
10.9%
o 524707
 
8.6%
c 438475
 
7.1%
i 414367
 
6.8%
g 381561
 
6.2%
f 363761
 
5.9%
l 347358
 
5.7%
Other values (4) 54613
 
0.9%
Common
ValueCountFrequency (%)
( 143146
50.0%
) 143146
50.0%
120
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6420194
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 1374426
21.4%
e 821459
12.8%
a 745322
11.6%
n 667733
10.4%
o 524707
 
8.2%
c 438475
 
6.8%
i 414367
 
6.5%
g 381561
 
5.9%
f 363761
 
5.7%
l 347358
 
5.4%
Other values (7) 341025
 
5.3%

keyword
Text

MISSING 

Distinct: 3659
Distinct (%): 0.9%
Missing: 502929
Missing (%): 55.7%
Memory size: 42.6 MiB

Length

Max length: 147
Median length: 14
Mean length: 14.332388
Min length: 1

Characters and Unicode

Total characters: 5743332
Distinct characters: 295
Distinct categories: 18
Distinct scripts: 13
Distinct blocks: 14
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
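
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories for two of the keyword sample values using Python's standard `unicodedata` module. It is a minimal stand-in rather than the report's own implementation: `category_counts` is a hypothetical helper, and the standard library exposes only the two-letter category codes (for example `Ll` = Lowercase Letter, `Zs` = Space Separator, `Ps`/`Pe` = Open/Close Punctuation, `Sm` = Math Symbol), not scripts or blocks.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two values taken from the keyword "Sample" rows of this report.
sample = ["(not provided)", "google + online"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Applied to a full column, the same loop would yield a table like "Most occurring categories"; script and block breakdowns would need a third-party Unicode data source.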

Unique

Unique: 2763
Unique (%): 0.7%

Sample

1st row: (not provided)
2nd row: (not provided)
3rd row: (not provided)
4th row: google + online
5th row: (not provided)
Value                  Count     Frequency (%)
not                    366363    46.2%
provided               366363    46.2%
6qehscssdk0z36ri       11503     1.5%
google                 10301     1.3%
merchandise            5671      0.7%
store                  5316      0.7%
targeting              3086      0.4%
youtube                2833      0.4%
remarketing/content    2298      0.3%
1hzbaqlcbjwfgoh7       2264      0.3%
Other values (2422)    16694     2.1%

Most occurring characters

ValueCountFrequency (%)
o 772784
13.5%
d 752095
13.1%
e 416576
 
7.3%
r 401124
 
7.0%
t 397351
 
6.9%
i 395175
 
6.9%
391968
 
6.8%
n 386835
 
6.7%
( 369878
 
6.4%
) 369877
 
6.4%
Other values (285) 1089669
19.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4480669
78.0%
Space Separator 391968
 
6.8%
Open Punctuation 369878
 
6.4%
Close Punctuation 369877
 
6.4%
Uppercase Letter 66215
 
1.2%
Decimal Number 53129
 
0.9%
Other Punctuation 5413
 
0.1%
Math Symbol 4796
 
0.1%
Dash Punctuation 690
 
< 0.1%
Other Letter 496
 
< 0.1%
Other values (8) 201
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
38
 
7.7%
37
 
7.5%
34
 
6.9%
32
 
6.5%
17
 
3.4%
16
 
3.2%
13
 
2.6%
12
 
2.4%
12
 
2.4%
11
 
2.2%
Other values (118) 274
55.2%
Lowercase Letter
ValueCountFrequency (%)
o 772784
17.2%
d 752095
16.8%
e 416576
9.3%
r 401124
9.0%
t 397351
8.9%
i 395175
8.8%
n 386835
8.6%
p 368763
8.2%
v 367210
8.2%
s 52440
 
1.2%
Other values (71) 170316
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
C 16135
24.4%
K 12009
18.1%
E 11748
17.7%
G 3090
 
4.7%
M 2992
 
4.5%
Z 2732
 
4.1%
A 2494
 
3.8%
R 2472
 
3.7%
O 2455
 
3.7%
H 2376
 
3.6%
Other values (18) 7712
11.6%
Other Punctuation
ValueCountFrequency (%)
/ 3604
66.6%
. 1181
 
21.8%
: 332
 
6.1%
& 144
 
2.7%
? 43
 
0.8%
' 33
 
0.6%
* 26
 
0.5%
, 17
 
0.3%
" 15
 
0.3%
# 7
 
0.1%
Other values (4) 11
 
0.2%
Spacing Mark
ValueCountFrequency (%)
29
44.6%
14
21.5%
ি 8
 
12.3%
3
 
4.6%
3
 
4.6%
2
 
3.1%
2
 
3.1%
1
 
1.5%
1
 
1.5%
1
 
1.5%
Decimal Number
ValueCountFrequency (%)
6 23566
44.4%
0 12053
22.7%
3 11550
21.7%
1 2936
 
5.5%
7 2288
 
4.3%
4 507
 
1.0%
2 127
 
0.2%
5 53
 
0.1%
9 32
 
0.1%
8 17
 
< 0.1%
Nonspacing Mark
ValueCountFrequency (%)
9
31.0%
5
17.2%
5
17.2%
3
 
10.3%
2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Dash Punctuation
ValueCountFrequency (%)
- 688
99.7%
1
 
0.1%
1
 
0.1%
Math Symbol
ValueCountFrequency (%)
+ 4577
95.4%
= 219
 
4.6%
Space Separator
ValueCountFrequency (%)
391968
100.0%
Open Punctuation
ValueCountFrequency (%)
( 369878
100.0%
Close Punctuation
ValueCountFrequency (%)
) 369877
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 99
100.0%
Format
ValueCountFrequency (%)
3
100.0%
Modifier Letter
ValueCountFrequency (%)
2
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
¨ 1
100.0%
Other Symbol
ValueCountFrequency (%)
👉 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4546613
79.2%
Common 1195858
 
20.8%
Cyrillic 227
 
< 0.1%
Bengali 190
 
< 0.1%
Han 186
 
< 0.1%
Katakana 71
 
< 0.1%
Devanagari 69
 
< 0.1%
Arabic 50
 
< 0.1%
Greek 44
 
< 0.1%
Hangul 10
 
< 0.1%
Other values (3) 14
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 772784
17.0%
d 752095
16.5%
e 416576
9.2%
r 401124
8.8%
t 397351
8.7%
i 395175
8.7%
n 386835
8.5%
p 368763
8.1%
v 367210
8.1%
s 52440
 
1.2%
Other values (59) 236260
 
5.2%
Common
ValueCountFrequency (%)
391968
32.8%
( 369878
30.9%
) 369877
30.9%
6 23566
 
2.0%
0 12053
 
1.0%
3 11550
 
1.0%
+ 4577
 
0.4%
/ 3604
 
0.3%
1 2936
 
0.2%
7 2288
 
0.2%
Other values (28) 3561
 
0.3%
Bengali
ValueCountFrequency (%)
29
15.3%
17
 
8.9%
16
 
8.4%
14
 
7.4%
9
 
4.7%
9
 
4.7%
8
 
4.2%
ি 8
 
4.2%
7
 
3.7%
6
 
3.2%
Other values (25) 67
35.3%
Han
ValueCountFrequency (%)
38
20.4%
37
19.9%
34
18.3%
32
17.2%
11
 
5.9%
10
 
5.4%
2
 
1.1%
1
 
0.5%
1
 
0.5%
1
 
0.5%
Other values (19) 19
10.2%
Cyrillic
ValueCountFrequency (%)
а 21
 
9.3%
о 20
 
8.8%
г 20
 
8.8%
т 19
 
8.4%
р 16
 
7.0%
н 16
 
7.0%
и 13
 
5.7%
у 13
 
5.7%
к 11
 
4.8%
е 11
 
4.8%
Other values (15) 67
29.5%
Devanagari
ValueCountFrequency (%)
7
 
10.1%
7
 
10.1%
5
 
7.2%
5
 
7.2%
5
 
7.2%
4
 
5.8%
4
 
5.8%
4
 
5.8%
3
 
4.3%
3
 
4.3%
Other values (15) 22
31.9%
Katakana
ValueCountFrequency (%)
13
18.3%
12
16.9%
12
16.9%
5
 
7.0%
5
 
7.0%
3
 
4.2%
3
 
4.2%
2
 
2.8%
2
 
2.8%
2
 
2.8%
Other values (11) 12
16.9%
Arabic
ValueCountFrequency (%)
و 11
22.0%
ي 10
20.0%
ت 7
14.0%
ب 6
12.0%
غ 2
 
4.0%
ل 2
 
4.0%
ج 2
 
4.0%
س 2
 
4.0%
ی 2
 
4.0%
ح 1
 
2.0%
Other values (5) 5
10.0%
Greek
ValueCountFrequency (%)
ο 8
18.2%
λ 5
11.4%
α 5
11.4%
τ 5
11.4%
γ 3
 
6.8%
π 3
 
6.8%
υ 3
 
6.8%
ι 3
 
6.8%
ε 3
 
6.8%
η 1
 
2.3%
Other values (5) 5
11.4%
Hangul
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Hebrew
ValueCountFrequency (%)
ס 2
28.6%
ק 1
14.3%
י 1
14.3%
ר 1
14.3%
ת 1
14.3%
א 1
14.3%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Gurmukhi
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5742438
> 99.9%
Cyrillic 227
 
< 0.1%
Bengali 190
 
< 0.1%
CJK 186
 
< 0.1%
Katakana 73
 
< 0.1%
None 70
 
< 0.1%
Devanagari 69
 
< 0.1%
Arabic 50
 
< 0.1%
Hebrew 7
 
< 0.1%
Compat Jamo 6
 
< 0.1%
Other values (4) 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 772784
13.5%
d 752095
13.1%
e 416576
 
7.3%
r 401124
 
7.0%
t 397351
 
6.9%
i 395175
 
6.9%
391968
 
6.8%
n 386835
 
6.7%
( 369878
 
6.4%
) 369877
 
6.4%
Other values (74) 1088775
19.0%
CJK
ValueCountFrequency (%)
38
20.4%
37
19.9%
34
18.3%
32
17.2%
11
 
5.9%
10
 
5.4%
2
 
1.1%
1
 
0.5%
1
 
0.5%
1
 
0.5%
Other values (19) 19
10.2%
Bengali
ValueCountFrequency (%)
29
15.3%
17
 
8.9%
16
 
8.4%
14
 
7.4%
9
 
4.7%
9
 
4.7%
8
 
4.2%
ি 8
 
4.2%
7
 
3.7%
6
 
3.2%
Other values (25) 67
35.3%
Cyrillic
ValueCountFrequency (%)
а 21
 
9.3%
о 20
 
8.8%
г 20
 
8.8%
т 19
 
8.4%
р 16
 
7.0%
н 16
 
7.0%
и 13
 
5.7%
у 13
 
5.7%
к 11
 
4.8%
е 11
 
4.8%
Other values (15) 67
29.5%
Katakana
ValueCountFrequency (%)
13
17.8%
12
16.4%
12
16.4%
5
 
6.8%
5
 
6.8%
3
 
4.1%
3
 
4.1%
2
 
2.7%
2
 
2.7%
2
 
2.7%
Other values (12) 14
19.2%
Arabic
ValueCountFrequency (%)
و 11
22.0%
ي 10
20.0%
ت 7
14.0%
ب 6
12.0%
غ 2
 
4.0%
ل 2
 
4.0%
ج 2
 
4.0%
س 2
 
4.0%
ی 2
 
4.0%
ح 1
 
2.0%
Other values (5) 5
10.0%
None
ValueCountFrequency (%)
ο 8
 
11.4%
ñ 7
 
10.0%
λ 5
 
7.1%
α 5
 
7.1%
τ 5
 
7.1%
γ 3
 
4.3%
π 3
 
4.3%
υ 3
 
4.3%
ι 3
 
4.3%
ε 3
 
4.3%
Other values (24) 25
35.7%
Devanagari
ValueCountFrequency (%)
7
 
10.1%
7
 
10.1%
5
 
7.2%
5
 
7.2%
5
 
7.2%
4
 
5.8%
4
 
5.8%
4
 
5.8%
3
 
4.3%
3
 
4.3%
Other values (15) 22
31.9%
Punctuation
ValueCountFrequency (%)
3
60.0%
1
 
20.0%
1
 
20.0%
Hebrew
ValueCountFrequency (%)
ס 2
28.6%
ק 1
14.3%
י 1
14.3%
ר 1
14.3%
ת 1
14.3%
א 1
14.3%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Compat Jamo
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Gurmukhi
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

referralPath
Path

MISSING 

Distinct: 1475
Distinct (%): 0.4%
Missing: 572712
Missing (%): 63.4%
Memory size: 39.5 MiB

Value | Count
/ | 75523
/yt/about/ | 71036
/analytics/web/ | 14620
/yt/about/tr/ | 14599
/yt/about/vi/ | 13753
Other values (1470) | 141410

Length

Max length: 270
Median length: 227
Mean length: 12.654307
Min length: 1

Characters and Unicode

Total characters: 4187829
Distinct characters: 80
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
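
As a rough illustration of the idea, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over a few of the sample paths. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool builds its tables. The standard library does not expose the script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Ll" or "Po"
            counts[unicodedata.category(ch)] += 1
    return counts

# A few referralPath values taken from the tables in this report
sample_paths = ["/", "/yt/about/", "/analytics/web/"]
counts = category_counts(sample_paths)
print(counts)
```

Here `Po` is the code for Other Punctuation and `Ll` for Lowercase Letter, the category names used in the tables of this report.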

Unique

Unique: 616
Unique (%): 0.2%

Sample

1st row: /
2nd row: /
3rd row: /corp/google.com/study/incentives/working-with-perks
4th row: /od/Things-To-Do-in-Silicon-Valley/fl/How-To-Visit-the-Googleplex-the-Google-Head-Office-in-Mountain-View.htm
5th row: /od/Things-To-Do-in-Silicon-Valley/fl/How-To-Visit-the-Googleplex-the-Google-Head-Office-in-Mountain-View.htm

Common Values

Value | Count | Frequency (%)
/ | 75523 | 8.4%
/yt/about/ | 71036 | 7.9%
/analytics/web/ | 14620 | 1.6%
/yt/about/tr/ | 14599 | 1.6%
/yt/about/vi/ | 13753 | 1.5%
/yt/about/es-419/ | 12735 | 1.4%
/yt/about/pt-BR/ | 12003 | 1.3%
/yt/about/th/ | 11430 | 1.3%
/yt/about/ru/ | 11193 | 1.2%
/yt/about/es/ | 7092 | 0.8%
Other values (1465) | 86957 | 9.6%
(Missing) | 572712 | 63.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
75523
22.8%
yt/about 71036
21.5%
analytics/web 14620
 
4.4%
yt/about/tr 14599
 
4.4%
yt/about/vi 13753
 
4.2%
yt/about/es-419 12735
 
3.8%
yt/about/pt-br 12003
 
3.6%
yt/about/th 11430
 
3.5%
yt/about/ru 11193
 
3.4%
yt/about/es 7092
 
2.1%
Other values (1455) 86957
26.3%

Most occurring characters

ValueCountFrequency (%)
/ 986276
23.6%
t 539889
12.9%
o 313010
 
7.5%
a 299566
 
7.2%
u 248905
 
5.9%
b 239785
 
5.7%
y 235075
 
5.6%
e 137264
 
3.3%
- 116586
 
2.8%
i 108143
 
2.6%
Other values (70) 963330
23.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2837830
67.8%
Other Punctuation 1002859
 
23.9%
Dash Punctuation 116586
 
2.8%
Uppercase Letter 113210
 
2.7%
Decimal Number 96366
 
2.3%
Connector Punctuation 19108
 
0.5%
Math Symbol 1862
 
< 0.1%
Other Letter 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 539889
19.0%
o 313010
11.0%
a 299566
10.6%
u 248905
8.8%
b 239785
8.4%
y 235075
8.3%
e 137264
 
4.8%
i 108143
 
3.8%
s 101202
 
3.6%
n 87445
 
3.1%
Other values (16) 527546
18.6%
Uppercase Letter
ValueCountFrequency (%)
B 20815
18.4%
G 13512
11.9%
R 13108
11.6%
T 10994
 
9.7%
V 6806
 
6.0%
H 5551
 
4.9%
W 4236
 
3.7%
I 4028
 
3.6%
D 3622
 
3.2%
M 3051
 
2.7%
Other values (16) 27487
24.3%
Decimal Number
ValueCountFrequency (%)
1 25651
26.6%
4 19573
20.3%
9 19453
20.2%
0 8132
 
8.4%
7 5765
 
6.0%
2 5212
 
5.4%
6 4462
 
4.6%
5 3207
 
3.3%
3 2764
 
2.9%
8 2147
 
2.2%
Other Letter
ValueCountFrequency (%)
س 2
25.0%
ف 1
12.5%
ي 1
12.5%
ل 1
12.5%
م 1
12.5%
ك 1
12.5%
ى 1
12.5%
Other Punctuation
ValueCountFrequency (%)
/ 986276
98.3%
. 15578
 
1.6%
% 836
 
0.1%
, 124
 
< 0.1%
: 44
 
< 0.1%
@ 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 1829
98.2%
+ 32
 
1.7%
~ 1
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 116586
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19108
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2951040
70.5%
Common 1236781
29.5%
Arabic 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 539889
18.3%
o 313010
10.6%
a 299566
10.2%
u 248905
 
8.4%
b 239785
 
8.1%
y 235075
 
8.0%
e 137264
 
4.7%
i 108143
 
3.7%
s 101202
 
3.4%
n 87445
 
3.0%
Other values (42) 640756
21.7%
Common
ValueCountFrequency (%)
/ 986276
79.7%
- 116586
 
9.4%
1 25651
 
2.1%
4 19573
 
1.6%
9 19453
 
1.6%
_ 19108
 
1.5%
. 15578
 
1.3%
0 8132
 
0.7%
7 5765
 
0.5%
2 5212
 
0.4%
Other values (11) 15447
 
1.2%
Arabic
ValueCountFrequency (%)
س 2
25.0%
ف 1
12.5%
ي 1
12.5%
ل 1
12.5%
م 1
12.5%
ك 1
12.5%
ى 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4187821
> 99.9%
Arabic 8
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 986276
23.6%
t 539889
12.9%
o 313010
 
7.5%
a 299566
 
7.2%
u 248905
 
5.9%
b 239785
 
5.7%
y 235075
 
5.6%
e 137264
 
3.3%
- 116586
 
2.8%
i 108143
 
2.6%
Other values (63) 963322
23.0%
Arabic
ValueCountFrequency (%)
س 2
25.0%
ف 1
12.5%
ي 1
12.5%
ل 1
12.5%
م 1
12.5%
ك 1
12.5%
ى 1
12.5%
Common prefix: /
Unique stems: 1474
Unique names: 515
Unique extensions: 23
Unique directories: 1124
Unique anchors: 1
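
The path components summarised above can be reproduced approximately with Python's `posixpath` module. This is a minimal sketch under the assumption that the tool splits paths in the `os.path` style, which is consistent with the tables in this section (for example, a path ending in `/` has an empty name). The helper name `split_path` is hypothetical.

```python
import posixpath

def split_path(path):
    """Split a URL-style path into stem, extension, name and directory."""
    stem, extension = posixpath.splitext(path)
    return {
        "stem": stem,                          # path without its extension
        "extension": extension,                # includes the leading dot, or ""
        "name": posixpath.basename(path),      # "" when the path ends in "/"
        "directory": posixpath.dirname(path),  # everything before the last "/"
    }

# Paths taken from the sample rows and common values of referralPath
paths = [
    "/",
    "/yt/about/",
    "/od/Things-To-Do-in-Silicon-Valley/fl/"
    "How-To-Visit-the-Googleplex-the-Google-Head-Office-in-Mountain-View.htm",
]
common_prefix = posixpath.commonprefix(paths)
parts = [split_path(p) for p in paths]
for p, d in zip(paths, parts):
    print(p, d)
```

Under this splitting, `/yt/about/` has the directory `/yt/about` and no name or extension, which matches the way such paths are grouped in the tables of this section.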
ValueCountFrequency (%)
/ 75523
 
8.4%
/yt/about/ 71036
 
7.9%
/analytics/web/ 14620
 
1.6%
/yt/about/tr/ 14599
 
1.6%
/yt/about/vi/ 13753
 
1.5%
/yt/about/es-419/ 12735
 
1.4%
/yt/about/pt-BR/ 12003
 
1.3%
/yt/about/th/ 11430
 
1.3%
/yt/about/ru/ 11193
 
1.2%
/yt/about/es/ 7092
 
0.8%
Other values (1465) 86957
 
9.6%
(Missing) 572712
63.4%
ValueCountFrequency (%)
/ 75523
 
8.4%
/yt/about/ 71036
 
7.9%
/analytics/web/ 14620
 
1.6%
/yt/about/tr/ 14599
 
1.6%
/yt/about/vi/ 13753
 
1.5%
/yt/about/es-419/ 12735
 
1.4%
/yt/about/pt-BR/ 12003
 
1.3%
/yt/about/th/ 11430
 
1.3%
/yt/about/ru/ 11193
 
1.2%
/yt/about/es/ 7092
 
0.8%
Other values (1464) 86957
 
9.6%
(Missing) 572712
63.4%
ValueCountFrequency (%)
304574
33.7%
using-the-logo.html 5284
 
0.6%
How-To-Visit-the-Googleplex-the-Google-Head-Office-in-Mountain-View.htm 2056
 
0.2%
index.html 2023
 
0.2%
c10b14f9a69ff71b1b7a 1784
 
0.2%
inpage_launch 1638
 
0.2%
alphabet-google-discounts 1118
 
0.1%
2145 1064
 
0.1%
Where-can-I-buy-a-stuffed-Go-language-gopher-mascot-online 872
 
0.1%
mobile 812
 
0.1%
Other values (505) 9716
 
1.1%
(Missing) 572712
63.4%
ValueCountFrequency (%)
319771
35.4%
.html 7848
 
0.9%
.htm 2059
 
0.2%
.php 724
 
0.1%
.aspx 190
 
< 0.1%
.jhtml 145
 
< 0.1%
.jspa 80
 
< 0.1%
.jsp 73
 
< 0.1%
.pdf 15
 
< 0.1%
.lai 8
 
< 0.1%
Other values (13) 28
 
< 0.1%
(Missing) 572712
63.4%
ValueCountFrequency (%)
/ 82812
 
9.2%
/yt/about 71408
 
7.9%
/analytics/web 16258
 
1.8%
/yt/about/tr 14665
 
1.6%
/yt/about/vi 13788
 
1.5%
/yt/about/es-419 12832
 
1.4%
/yt/about/pt-BR 12077
 
1.3%
/yt/about/th 11471
 
1.3%
/yt/about/ru 11294
 
1.2%
/yt/about/es 7143
 
0.8%
Other values (1114) 77193
 
8.5%
(Missing) 572712
63.4%
ValueCountFrequency (%)
330941
36.6%
(Missing) 572712
63.4%

adwordsClickInfo.page
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED 

Distinct: 8
Distinct (%): < 0.1%
Missing: 882193
Missing (%): 97.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0081081
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 1
Maximum: 14
Range: 13
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.17358392
Coefficient of variation (CV): 0.1721878
Kurtosis: 2188.361
Mean: 1.0081081
Median Absolute Deviation (MAD): 0
Skewness: 40.170902
Sum: 21634
Variance: 0.030131376
Monotonicity: Not monotonic
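
Several of these figures can be re-derived from the value counts reported for adwordsClickInfo.page in this section. The sketch below is a worked check, assuming the variance is the sample variance (divisor n - 1), which is the choice that reproduces the reported value. The variable names are illustrative.

```python
import math

# Value counts for adwordsClickInfo.page as listed in this report (missing rows excluded)
value_counts = {1: 21362, 2: 73, 3: 10, 4: 2, 5: 7, 7: 3, 9: 2, 14: 1}

n = sum(value_counts.values())                         # non-missing observations
total = sum(v * c for v, c in value_counts.items())    # the "Sum" statistic
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1
squared_dev = sum(c * (v - mean) ** 2 for v, c in value_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                        # coefficient of variation

print(n, total, mean, variance, std, cv)
```

The non-missing count of 21460 also agrees with the 903653 observations minus the 882193 missing values.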
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
1 | 21362 | 2.4%
2 | 73 | < 0.1%
3 | 10 | < 0.1%
5 | 7 | < 0.1%
7 | 3 | < 0.1%
9 | 2 | < 0.1%
4 | 2 | < 0.1%
14 | 1 | < 0.1%
(Missing) | 882193 | 97.6%

Value | Count | Frequency (%)
1 | 21362 | 2.4%
2 | 73 | < 0.1%
3 | 10 | < 0.1%
4 | 2 | < 0.1%
5 | 7 | < 0.1%
7 | 3 | < 0.1%
9 | 2 | < 0.1%
14 | 1 | < 0.1%

Value | Count | Frequency (%)
14 | 1 | < 0.1%
9 | 2 | < 0.1%
7 | 3 | < 0.1%
5 | 7 | < 0.1%
4 | 2 | < 0.1%
3 | 10 | < 0.1%
2 | 73 | < 0.1%
1 | 21362 | 2.4%

adwordsClickInfo.slot
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 882193
Missing (%): 97.6%
Memory size: 34.9 MiB

Value | Count
Top | 20956
RHS | 504

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 64380
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Top
2nd row: Top
3rd row: Top
4th row: Top
5th row: Top

Common Values

Value | Count | Frequency (%)
Top | 20956 | 2.3%
RHS | 504 | 0.1%
(Missing) | 882193 | 97.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
top 20956
97.7%
rhs 504
 
2.3%

Most occurring characters

ValueCountFrequency (%)
T 20956
32.6%
o 20956
32.6%
p 20956
32.6%
R 504
 
0.8%
H 504
 
0.8%
S 504
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 41912
65.1%
Uppercase Letter 22468
34.9%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
T 20956
93.3%
R 504
 
2.2%
H 504
 
2.2%
S 504
 
2.2%
Lowercase Letter
ValueCountFrequency (%)
o 20956
50.0%
p 20956
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 64380
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
T 20956
32.6%
o 20956
32.6%
p 20956
32.6%
R 504
 
0.8%
H 504
 
0.8%
S 504
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64380
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
T 20956
32.6%
o 20956
32.6%
p 20956
32.6%
R 504
 
0.8%
H 504
 
0.8%
S 504
 
0.8%

adwordsClickInfo.gclId

Distinct: 17774
Distinct (%): 82.4%
Missing: 882092
Missing (%): 97.6%
Memory size: 29.5 MiB

Length

Max length: 92
Median length: 91
Mean length: 69.235703
Min length: 26

Characters and Unicode

Total characters: 1492791
Distinct characters: 64
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15688
Unique (%): 72.8%

Sample

1st row: Cj0KEQjwxqS-BRDRgPLp0q2t0IUBEiQAgfMXRBVDYwnFawcmsrhs02pjO7FXPLhzHyvJFv53h1H4QJ8aAhtw8P8HAQ
2nd row: Cj0KEQjwxqS-BRDRgPLp0q2t0IUBEiQAgfMXRAq0D2zir1iAiqwgFU0lcMGVY6qaqhBTOFSAIW7gM8saAiku8P8HAQ
3rd row: Cj0KEQjwxqS-BRDRgPLp0q2t0IUBEiQAgfMXRMbhgNCALey5pPeCxitqlWsaKLtXW_EC8qRLRreq6OMaApJJ8P8HAQ
4th row: Cj0KEQjwxqS-BRDRgPLp0q2t0IUBEiQAgfMXRBRI7rtb79aCyB-UUNNHh1V712wows-T-MlL9VW-8ZEaAhqd8P8HAQ
5th row: Cj0KEQjwxqS-BRDRgPLp0q2t0IUBEiQAgfMXRDKcQOTkfRji3NxEErk_rDSPqc8VzHFSZnRcZBCoBOgaAgeG8P8HAQ
ValueCountFrequency (%)
cj0keqjwmirjbrcrmj_x7kdo-9obeiqauupkmufmpug3zdwyo8gtsjibfd5mphstza9y_9ncri8x97oaaglc8p8haq 70
 
0.3%
cj0keqjw1ee_brd3hk6x993yzeobeiqa5rh_bea562m9tvl_mtnafvtdndqoqrp1rvxmmgwjcx1lafwaaj4o8p8haq 41
 
0.2%
cjh1vbf94m8cfuelgqodyakhgq 29
 
0.1%
cj0keqiaw_debrchnyiq_562gsebeiqa4lcssmb_rwgvppnltzlzj5rgwqx5lk87wc5cjfcqznenzewaaiap8p8haq 27
 
0.1%
cjwkeaiaj7tcbrcp2z22ue-zrj4sjacg7sbejui6ggr6oca-edc2-lx7w1m5ia1c_qnbzwzvtquanxocb5rw_wcb 24
 
0.1%
cn_u9pavhdacfcnahgodtcqajw 22
 
0.1%
cjwkeaiaxkrfbrdm25f60oegtwwsjabgec-z0_dlpcxhm1ztqlr1ywewxu875yaqwupt7pgmgfezthoceezw_wcb 21
 
0.1%
cnhp7nf2ytmcfvlwdqod_iol5a 20
 
0.1%
cjwkeaiavs7cbrc24rao6bgcoiasjabact5dtalfxcossvr2e2aduhx6z6oe0kauvtqkzl-bcvn1-hocnlrw_wcb 20
 
0.1%
cj6xtee6j9acfqqdfgod8tkdnw 18
 
0.1%
Other values (17764) 21269
98.6%

Most occurring characters

ValueCountFrequency (%)
A 71957
 
4.8%
C 62703
 
4.2%
w 52857
 
3.5%
B 47608
 
3.2%
E 42671
 
2.9%
Q 42294
 
2.8%
j 40337
 
2.7%
K 32775
 
2.2%
o 31969
 
2.1%
R 30844
 
2.1%
Other values (54) 1036776
69.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 686721
46.0%
Lowercase Letter 564972
37.8%
Decimal Number 193611
 
13.0%
Connector Punctuation 28864
 
1.9%
Dash Punctuation 18623
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 71957
 
10.5%
C 62703
 
9.1%
B 47608
 
6.9%
E 42671
 
6.2%
Q 42294
 
6.2%
K 32775
 
4.8%
R 30844
 
4.5%
D 26904
 
3.9%
I 24996
 
3.6%
P 23785
 
3.5%
Other values (16) 280184
40.8%
Lowercase Letter
ValueCountFrequency (%)
w 52857
 
9.4%
j 40337
 
7.1%
o 31969
 
5.7%
i 28639
 
5.1%
a 24959
 
4.4%
c 24188
 
4.3%
g 24027
 
4.3%
d 21898
 
3.9%
s 20665
 
3.7%
h 20152
 
3.6%
Other values (16) 275281
48.7%
Decimal Number
ValueCountFrequency (%)
8 30621
15.8%
0 24046
12.4%
9 18500
9.6%
4 18386
9.5%
7 18118
9.4%
6 17972
9.3%
3 17049
8.8%
2 16457
8.5%
5 16391
8.5%
1 16071
8.3%
Connector Punctuation
ValueCountFrequency (%)
_ 28864
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18623
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1251693
83.8%
Common 241098
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 71957
 
5.7%
C 62703
 
5.0%
w 52857
 
4.2%
B 47608
 
3.8%
E 42671
 
3.4%
Q 42294
 
3.4%
j 40337
 
3.2%
K 32775
 
2.6%
o 31969
 
2.6%
R 30844
 
2.5%
Other values (42) 795678
63.6%
Common
ValueCountFrequency (%)
8 30621
12.7%
_ 28864
12.0%
0 24046
10.0%
- 18623
7.7%
9 18500
7.7%
4 18386
7.6%
7 18118
7.5%
6 17972
7.5%
3 17049
7.1%
2 16457
6.8%
Other values (2) 32462
13.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1492791
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 71957
 
4.8%
C 62703
 
4.2%
w 52857
 
3.5%
B 47608
 
3.2%
E 42671
 
2.9%
Q 42294
 
2.8%
j 40337
 
2.7%
K 32775
 
2.2%
o 31969
 
2.1%
R 30844
 
2.1%
Other values (54) 1036776
69.5%

adwordsClickInfo.adNetworkType
Categorical

HIGH CORRELATION  IMBALANCE  MISSING 

Distinct: 2
Distinct (%): < 0.1%
Missing: 882193
Missing (%): 97.6%
Memory size: 35.1 MiB

Value | Count
Google Search | 21453
Search partners | 7

Length

Max length: 15
Median length: 13
Mean length: 13.000652
Min length: 13

Characters and Unicode

Total characters: 278994
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Google Search
2nd row: Google Search
3rd row: Google Search
4th row: Google Search
5th row: Google Search

Common Values

Value | Count | Frequency (%)
Google Search | 21453 | 2.4%
Search partners | 7 | < 0.1%
(Missing) | 882193 | 97.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
search 21460
50.0%
google 21453
50.0%
partners 7
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 42920
15.4%
o 42906
15.4%
r 21474
7.7%
a 21467
7.7%
21460
7.7%
S 21460
7.7%
c 21460
7.7%
h 21460
7.7%
G 21453
7.7%
g 21453
7.7%
Other values (5) 21481
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 214621
76.9%
Uppercase Letter 42913
 
15.4%
Space Separator 21460
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 42920
20.0%
o 42906
20.0%
r 21474
10.0%
a 21467
10.0%
c 21460
10.0%
h 21460
10.0%
g 21453
10.0%
l 21453
10.0%
p 7
 
< 0.1%
t 7
 
< 0.1%
Other values (2) 14
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
S 21460
50.0%
G 21453
50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 21460 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 257534
92.3%
Common 21460
 
7.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 42920
16.7%
o 42906
16.7%
r 21474
8.3%
a 21467
8.3%
S 21460
8.3%
c 21460
8.3%
h 21460
8.3%
G 21453
8.3%
g 21453
8.3%
l 21453
8.3%
Other values (4) 28
 
< 0.1%
Common
ValueCountFrequency (%)
21460
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 278994
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 42920
15.4%
o 42906
15.4%
r 21474
7.7%
a 21467
7.7%
21460
7.7%
S 21460
7.7%
c 21460
7.7%
h 21460
7.7%
G 21453
7.7%
g 21453
7.7%
Other values (5) 21481
7.7%

adContent
Categorical

HIGH CORRELATION  MISSING 

Distinct: 44
Distinct (%): 0.4%
Missing: 892707
Missing (%): 98.8%
Memory size: 34.9 MiB

Value | Count
Google Merchandise Collection | 5122
Google Online Store | 1245
Display Ad created 3/11/14 | 967
Full auto ad IMAGE ONLY | 822
Ad from 12/13/16 | 610
Other values (39) | 2180

Length

Max length: 43
Median length: 34
Mean length: 25.126622
Min length: 8

Characters and Unicode

Total characters: 275036
Distinct characters: 63
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): < 0.1%

Sample

1st row: Full auto ad IMAGE ONLY
2nd row: First Full Auto Template Test Ad
3rd row: {KeyWord:Google Brand Items}
4th row: Full auto ad IMAGE ONLY
5th row: Full auto ad IMAGE ONLY

Common Values

Value | Count | Frequency (%)
Google Merchandise Collection | 5122 | 0.6%
Google Online Store | 1245 | 0.1%
Display Ad created 3/11/14 | 967 | 0.1%
Full auto ad IMAGE ONLY | 822 | 0.1%
Ad from 12/13/16 | 610 | 0.1%
Ad from 11/3/16 | 489 | 0.1%
Display Ad created 3/11/15 | 392 | < 0.1%
{KeyWord:Google Brand Items} | 251 | < 0.1%
{KeyWord:Google Merchandise} | 155 | < 0.1%
Ad from 11/7/16 | 123 | < 0.1%
Other values (34) | 770 | 0.1%
(Missing) | 892707 | 98.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
google 6657
18.6%
merchandise 5368
15.0%
collection 5122
14.3%
ad 3572
10.0%
display 1409
 
3.9%
created 1409
 
3.9%
store 1252
 
3.5%
online 1245
 
3.5%
from 1225
 
3.4%
3/11/14 967
 
2.7%
Other values (60) 7620
21.3%

Most occurring characters

ValueCountFrequency (%)
e 29938
 
10.9%
o 29217
 
10.6%
24900
 
9.1%
l 22124
 
8.0%
n 13612
 
4.9%
i 13600
 
4.9%
c 12045
 
4.4%
d 11551
 
4.2%
a 10739
 
3.9%
r 10693
 
3.9%
Other values (53) 96617
35.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 192416
70.0%
Uppercase Letter 35962
 
13.1%
Space Separator 24900
 
9.1%
Decimal Number 14087
 
5.1%
Other Punctuation 6172
 
2.2%
Open Punctuation 677
 
0.2%
Close Punctuation 677
 
0.2%
Connector Punctuation 110
 
< 0.1%
Dash Punctuation 35
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 29938
15.6%
o 29217
15.2%
l 22124
11.5%
n 13612
7.1%
i 13600
7.1%
c 12045
6.3%
d 11551
 
6.0%
a 10739
 
5.6%
r 10693
 
5.6%
t 9533
 
5.0%
Other values (12) 29364
15.3%
Uppercase Letter
ValueCountFrequency (%)
G 8112
22.6%
M 6221
17.3%
C 5126
14.3%
A 3652
10.2%
O 2096
 
5.8%
D 1482
 
4.1%
S 1400
 
3.9%
I 1083
 
3.0%
F 1031
 
2.9%
L 996
 
2.8%
Other values (12) 4763
13.2%
Decimal Number
ValueCountFrequency (%)
1 8015
56.9%
3 2458
 
17.4%
6 1222
 
8.7%
4 1017
 
7.2%
2 688
 
4.9%
5 433
 
3.1%
7 179
 
1.3%
0 75
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 5276
85.5%
: 677
 
11.0%
? 111
 
1.8%
% 75
 
1.2%
' 31
 
0.5%
! 2
 
< 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 24900 | 100.0%
Open Punctuation
ValueCountFrequency (%)
{ 677
100.0%
Close Punctuation
ValueCountFrequency (%)
} 677
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 110
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 35
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 228378
83.0%
Common 46658
 
17.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 29938
13.1%
o 29217
12.8%
l 22124
 
9.7%
n 13612
 
6.0%
i 13600
 
6.0%
c 12045
 
5.3%
d 11551
 
5.1%
a 10739
 
4.7%
r 10693
 
4.7%
t 9533
 
4.2%
Other values (34) 65326
28.6%
Common
ValueCountFrequency (%)
24900
53.4%
1 8015
 
17.2%
/ 5276
 
11.3%
3 2458
 
5.3%
6 1222
 
2.6%
4 1017
 
2.2%
2 688
 
1.5%
{ 677
 
1.5%
: 677
 
1.5%
} 677
 
1.5%
Other values (9) 1051
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 275036
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 29938
 
10.9%
o 29217
 
10.6%
24900
 
9.1%
l 22124
 
8.0%
n 13612
 
4.9%
i 13600
 
4.9%
c 12045
 
4.4%
d 11551
 
4.2%
a 10739
 
3.9%
r 10693
 
3.9%
Other values (53) 96617
35.1%

conversion
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 50.0 MiB

Value | Count
0 | 892138
1 | 11515

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 903653
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 892138 | 98.7%
1 | 11515 | 1.3%
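
The IMBALANCE flag on this variable can be related to the class counts above. The sketch below assumes the imbalance score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. That definition is an assumption about the profiling tool rather than something stated in this report, although it reproduces the 90.2% reported for conversion and the 83.9% reported for adwordsClickInfo.slot in the Alerts section. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalised Shannon entropy; 0 means balanced, 1 means a single class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits, skipping empty classes
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

conversion_score = imbalance_score([892138, 11515])   # counts for values 0 and 1
slot_score = imbalance_score([20956, 504])            # counts for Top and RHS
print(f"{conversion_score:.1%}", f"{slot_score:.1%}")
```

A perfectly balanced two-class variable scores 0 under this definition, so a score above 90% indicates that nearly all rows fall in one class.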

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 892138
98.7%
1 11515
 
1.3%

Most occurring characters

ValueCountFrequency (%)
0 892138
98.7%
1 11515
 
1.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 903653
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 892138
98.7%
1 11515
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Common 903653
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 892138
98.7%
1 11515
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 903653
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 892138
98.7%
1 11515
 
1.3%

Interactions

[Figure: 49 pairwise interaction plots; images not included in this text version]

Correlations

(variable) | adContent | adwordsClickInfo.adNetworkType | adwordsClickInfo.page | adwordsClickInfo.slot | campaign | channelGrouping | continent | conversion | deviceCategory | hits | isMobile | medium | operatingSystem | pageviews | subContinent | transactionRevenue | visitId | visitNumber | visitStartTime
adContent | 1.000 | 0.431 | 0.076 | 0.330 | 0.979 | 0.998 | 0.327 | 0.076 | 0.262 | 0.173 | 0.321 | 0.998 | 0.174 | 0.181 | 0.201 | -0.127 | 0.559 | -0.298 | 0.559
adwordsClickInfo.adNetworkType | 0.431 | 1.000 | -0.001 | 0.108 | 0.336 | 1.000 | 0.000 | 0.000 | 0.000 | -0.021 | 0.000 | 1.000 | 0.000 | -0.021 | 0.000 | NaN | -0.007 | 0.002 | -0.007
adwordsClickInfo.page | 0.076 | -0.001 | 1.000 | 0.186 | 0.005 | 1.000 | 0.000 | 0.000 | 0.004 | -0.032 | 0.013 | 1.000 | 0.000 | -0.032 | 0.000 | NaN | -0.003 | -0.008 | -0.003
adwordsClickInfo.slot | 0.330 | 0.108 | 0.186 | 1.000 | 0.098 | 1.000 | 0.013 | 0.009 | 0.109 | 0.082 | 0.059 | 1.000 | 0.070 | 0.084 | 0.000 | 0.005 | 0.015 | 0.045 | 0.015
campaign | 0.979 | 0.336 | 0.005 | 0.098 | 1.000 | 0.516 | 0.069 | 0.019 | 0.068 | 0.041 | 0.094 | 0.557 | 0.039 | 0.041 | 0.060 | -0.041 | 0.030 | 0.034 | 0.030
channelGrouping | 0.998 | 1.000 | 1.000 | 1.000 | 0.516 | 1.000 | 0.188 | 0.131 | 0.218 | -0.113 | 0.307 | 1.000 | 0.169 | -0.111 | 0.222 | 0.031 | -0.226 | -0.088 | -0.226
continent | 0.327 | 0.000 | 0.000 | 0.013 | 0.069 | 0.188 | 1.000 | 0.109 | 0.063 | -0.226 | 0.069 | 0.131 | 0.157 | -0.227 | 1.000 | 0.002 | -0.011 | -0.165 | -0.011
conversion | 0.076 | 0.000 | 0.000 | 0.009 | 0.019 | 0.131 | 0.109 | 1.000 | 0.045 | 0.196 | 0.045 | 0.035 | 0.089 | 0.199 | 0.123 | NaN | 0.011 | 0.109 | 0.011
deviceCategory | 0.262 | 0.000 | 0.004 | 0.109 | 0.068 | 0.218 | 0.063 | 0.045 | 1.000 | -0.015 | 0.999 | 0.216 | 0.715 | -0.017 | 0.129 | -0.176 | 0.144 | -0.028 | 0.144
hits | 0.173 | -0.021 | -0.032 | 0.082 | 0.041 | -0.113 | -0.226 | 0.196 | -0.015 | 1.000 | 0.024 | 0.007 | 0.016 | 0.992 | 0.022 | 0.295 | -0.000 | 0.115 | -0.000
isMobile | 0.321 | 0.000 | 0.013 | 0.059 | 0.094 | 0.307 | 0.069 | 0.045 | 0.999 | 0.024 | 1.000 | 0.304 | 0.994 | -0.018 | 0.167 | -0.176 | 0.146 | -0.029 | 0.146
medium | 0.998 | 1.000 | 1.000 | 1.000 | 0.557 | 1.000 | 0.131 | 0.035 | 0.216 | 0.007 | 0.304 | 1.000 | 0.145 | -0.072 | 0.169 | 0.034 | -0.198 | -0.053 | -0.198
operatingSystem | 0.174 | 0.000 | 0.000 | 0.070 | 0.039 | 0.169 | 0.157 | 0.089 | 0.715 | 0.016 | 0.994 | 0.145 | 1.000 | -0.068 | 0.116 | -0.067 | 0.001 | -0.074 | 0.001
pageviews | 0.181 | -0.021 | -0.032 | 0.084 | 0.041 | -0.111 | -0.227 | 0.199 | -0.017 | 0.992 | -0.018 | -0.072 | -0.068 | 1.000 | 0.016 | 0.270 | 0.005 | 0.114 | 0.005
subContinent | 0.201 | 0.000 | 0.000 | 0.000 | 0.060 | 0.222 | 1.000 | 0.123 | 0.129 | 0.022 | 0.167 | 0.169 | 0.116 | 0.016 | 1.000 | 0.027 | -0.047 | -0.110 | -0.047
transactionRevenue | -0.127 | NaN | NaN | 0.005 | -0.041 | 0.031 | 0.002 | NaN | -0.176 | 0.295 | -0.176 | 0.034 | -0.067 | 0.270 | 0.027 | 1.000 | -0.069 | 0.218 | -0.069
visitId | 0.559 | -0.007 | -0.003 | 0.015 | 0.030 | -0.226 | -0.011 | 0.011 | 0.144 | -0.000 | 0.146 | -0.198 | 0.001 | 0.005 | -0.047 | -0.069 | 1.000 | 0.041 | 1.000
visitNumber | -0.298 | 0.002 | -0.008 | 0.045 | 0.034 | -0.088 | -0.165 | 0.109 | -0.028 | 0.115 | -0.029 | -0.053 | -0.074 | 0.114 | -0.110 | 0.218 | 0.041 | 1.000 | 0.041
visitStartTime | 0.559 | -0.007 | -0.003 | 0.015 | 0.030 | -0.226 | -0.011 | 0.011 | 0.144 | -0.000 | 0.146 | -0.198 | 0.001 | 0.005 | -0.047 | -0.069 | 1.000 | 0.041 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
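
To make the idea of nullity correlation concrete, the sketch below computes the Pearson correlation between missingness indicators (1 where a value is missing, 0 otherwise) for pairs of columns. The rows are toy data invented for illustration, not values from this dataset, and the helper names are hypothetical. A column with no missing values, or with only missing values, has zero variance and would need to be excluded first.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def nullity_correlation(rows, col_a, col_b):
    """Correlate the missingness (None) indicators of two columns."""
    a = [1 if row[col_a] is None else 0 for row in rows]
    b = [1 if row[col_b] is None else 0 for row in rows]
    return pearson(a, b)

# Toy rows (hypothetical): slot and page are always missing together
rows = [
    {"slot": "Top", "page": 1,    "revenue": None},
    {"slot": None,  "page": None, "revenue": None},
    {"slot": "RHS", "page": 1,    "revenue": 10.0},
    {"slot": None,  "page": None, "revenue": 5.0},
]
slot_page = nullity_correlation(rows, "slot", "page")
slot_revenue = nullity_correlation(rows, "slot", "revenue")
print(slot_page, slot_revenue)
```

A value near 1 means two columns tend to be missing in the same rows. In this report, adwordsClickInfo.page, adwordsClickInfo.slot and adwordsClickInfo.adNetworkType each have 882193 missing values, which is the pattern one would expect if they are missing together.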

Sample

(index) | channelGrouping | date | fullVisitorId | sessionId | visitId | visitNumber | visitStartTime | continent | subContinent | country | region | metro | city | networkDomain | browser | operatingSystem | isMobile | deviceCategory | hits | pageviews | transactionRevenue | campaign | source | medium | keyword | referralPath | adwordsClickInfo.page | adwordsClickInfo.slot | adwordsClickInfo.gclId | adwordsClickInfo.adNetworkType | adContent | conversion
0 | Organic Search | 1970-01-01 00:00:00.020160902 | 1131660440785968503 | 1131660440785968503_1472830385 | 1472830385 | 1 | 1472830385 | Asia | Western Asia | Turkey | Izmir | (not set) | Izmir | ttnet.com.tr | Chrome | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
1 | Organic Search | 1970-01-01 00:00:00.020160902 | 377306020877927890 | 377306020877927890_1472880147 | 1472880147 | 1 | 1472880147 | Oceania | Australasia | Australia | not available in demo dataset | not available in demo dataset | not available in demo dataset | dodo.net.au | Firefox | Macintosh | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
2 | Organic Search | 1970-01-01 00:00:00.020160902 | 3895546263509774583 | 3895546263509774583_1472865386 | 1472865386 | 1 | 1472865386 | Europe | Southern Europe | Spain | Community of Madrid | (not set) | Madrid | unknown.unknown | Chrome | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
3 | Organic Search | 1970-01-01 00:00:00.020160902 | 4763447161404445595 | 4763447161404445595_1472881213 | 1472881213 | 1 | 1472881213 | Asia | Southeast Asia | Indonesia | not available in demo dataset | not available in demo dataset | not available in demo dataset | unknown.unknown | UC Browser | Linux | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | google + online | NaN | NaN | NaN | NaN | NaN | NaN | 0
4 | Organic Search | 1970-01-01 00:00:00.020160902 | 27294437909732085 | 27294437909732085_1472822600 | 1472822600 | 2 | 1472822600 | Europe | Northern Europe | United Kingdom | not available in demo dataset | not available in demo dataset | not available in demo dataset | unknown.unknown | Chrome | Android | True | mobile | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
5 | Organic Search | 1970-01-01 00:00:00.020160902 | 2938943183656635653 | 2938943183656635653_1472807194 | 1472807194 | 1 | 1472807194 | Europe | Southern Europe | Italy | not available in demo dataset | not available in demo dataset | not available in demo dataset | fastwebnet.it | Chrome | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
6 | Organic Search | 1970-01-01 00:00:00.020160902 | 1905672039242460897 | 1905672039242460897_1472817241 | 1472817241 | 1 | 1472817241 | Asia | Southern Asia | Pakistan | not available in demo dataset | not available in demo dataset | not available in demo dataset | unknown.unknown | Chrome | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
7 | Organic Search | 1970-01-01 00:00:00.020160902 | 537222803633850821 | 537222803633850821_1472812602 | 1472812602 | 1 | 1472812602 | Oceania | Australasia | Australia | Queensland | (not set) | Brisbane | bigpond.net.au | Chrome | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
8 | Organic Search | 1970-01-01 00:00:00.020160902 | 4445454811831400414 | 4445454811831400414_1472805784 | 1472805784 | 1 | 1472805784 | Europe | Western Europe | Austria | not available in demo dataset | not available in demo dataset | not available in demo dataset | spar.at | Internet Explorer | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
9 | Organic Search | 1970-01-01 00:00:00.020160902 | 9499785259412240342 | 9499785259412240342_1472812272 | 1472812272 | 1 | 1472812272 | Europe | Western Europe | Netherlands | not available in demo dataset | not available in demo dataset | not available in demo dataset | chello.nl | Firefox | Windows | False | desktop | 1 | 1.0 | NaN | (not set) | google | organic | (not provided) | NaN | NaN | NaN | NaN | NaN | NaN | 0
channelGroupingdatefullVisitorIdsessionIdvisitIdvisitNumbervisitStartTimecontinentsubContinentcountryregionmetrocitynetworkDomainbrowseroperatingSystemisMobiledeviceCategoryhitspageviewstransactionRevenuecampaignsourcemediumkeywordreferralPathadwordsClickInfo.pageadwordsClickInfo.slotadwordsClickInfo.gclIdadwordsClickInfo.adNetworkTypeadContentconversion
903643Social1970-01-01 00:00:00.02017010456144239667131330056144239667131330_1483600246148360024641483600246AmericasNorthern AmericaUnited StatesCaliforniaSan Francisco-Oakland-San Jose CAFremont(not set)ChromeMacintoshFalsedesktop1110.0NaN(not set)groups.google.comreferralNaN/a/google.com/forum/NaNNaNNaNNaNNaN0
903644Social1970-01-01 00:00:00.020170104256641390199759052256641390199759052_1483556333148355633311483556333AmericasNorthern AmericaUnited Statesnot available in demo datasetnot available in demo datasetnot available in demo datasetrr.comChromeiOSTruetablet117.0NaN(not set)m.youtube.comreferralNaN/watchNaNNaNNaNNaNNaN0
903645Social1970-01-01 00:00:00.02017010420350956320748350752035095632074835075_1483570454148357045411483570454AmericasNorthern AmericaUnited StatesNew YorkNew York NYNew Yorkatt.netSafari (in-app)iOSTruemobile118.0NaN(not set)m.facebook.comreferralNaN/NaNNaNNaNNaNNaN0
903646Social1970-01-01 00:00:00.020170104567297396362985009567297396362985009_1483581760148358176011483581760OceaniaAustralasiaAustraliaVictoria(not set)Melbourneoptusnet.com.auChromeiOSTruetablet1512.0NaN(not set)youtube.comreferralNaN/yt/about/NaNNaNNaNNaNNaN0
903647Social1970-01-01 00:00:00.02017010421401499743393162332140149974339316233_1483557808148355780811483557808AfricaNorthern AfricaEgyptnot available in demo datasetnot available in demo datasetnot available in demo datasettedata.netChromeWindowsFalsedesktop1611.0NaN(not set)youtube.comreferralNaN/yt/about/ar/NaNNaNNaNNaNNaN0
903648Social1970-01-01 00:00:00.02017010451237791003075003325123779100307500332_1483554750148355475011483554750AmericasCaribbeanPuerto Riconot available in demo datasetnot available in demo datasetnot available in demo datasetprtc.netChromeWindowsFalsedesktop1715.0NaN(not set)youtube.comreferralNaN/yt/about/NaNNaNNaNNaNNaN0
903649Social1970-01-01 00:00:00.02017010472317289649739598427231728964973959842_1483543798148354379811483543798AsiaSouthern AsiaSri Lankanot available in demo datasetnot available in demo datasetnot available in demo datasetunknown.unknownChromeAndroidTruemobile1813.0NaN(not set)youtube.comreferralNaN/yt/about/NaNNaNNaNNaNNaN0
903650Social1970-01-01 00:00:00.02017010457445766323964068995744576632396406899_1483526434148352643411483526434AsiaEastern AsiaSouth KoreaSeoul(not set)Seoulunknown.unknownAndroid WebviewAndroidTruemobile2421.0NaN(not set)youtube.comreferralNaN/yt/about/ko/NaNNaNNaNNaNNaN0
903651Social1970-01-01 00:00:00.02017010427093554559917507752709355455991750775_1483592857148359285711483592864AsiaSoutheast AsiaIndonesianot available in demo datasetnot available in demo datasetnot available in demo datasetunknown.unknownChromeWindowsFalsedesktop2422.0NaN(not set)facebook.comreferralNaN/l.phpNaNNaNNaNNaNNaN0
903652Social1970-01-01 00:00:00.0201701048149001636178050530814900163617805053_1483574474148357447411483574474AmericasCentral AmericaMexiconot available in demo datasetnot available in demo datasetnot available in demo datasetcybercable.net.mxChromeAndroidTruemobile3131.0NaN(not set)youtube.comreferralNaN/yt/about/es-419/NaNNaNNaNNaNNaN0
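
The `date` values in the sample all render as `1970-01-01 00:00:00.0YYYYMMDD`, which is what typically results when an integer such as `20160902` is interpreted as a nanosecond offset from the Unix epoch instead of as a `YYYYMMDD` calendar date. Assuming that is what happened here, the sketch below recovers the intended date from the rendered string. The function name `recover_date` is hypothetical, and the interpretation itself is an inference from the sample rows, not something the report states.

```python
from datetime import date, datetime, timezone


def recover_date(rendered: str) -> date:
    """Recover a YYYYMMDD date from a timestamp rendered as nanoseconds since the epoch.

    Assumes the input looks like '1970-01-01 00:00:00.020160902', where the
    nine-digit fraction is the original integer left-padded with zeros.
    """
    fraction = rendered.rsplit(".", 1)[1]
    yyyymmdd = str(int(fraction))  # drop the zero padding
    return datetime.strptime(yyyymmdd, "%Y%m%d").date()


first_row = recover_date("1970-01-01 00:00:00.020160902")
last_row = recover_date("1970-01-01 00:00:00.020170104")
print(first_row, last_row)

# Cross-check against visitStartTime of the first sample row, read as Unix seconds.
# Agreement is not guaranteed for every row, because the reporting timezone of the
# date column is unknown and may differ from UTC.
utc_day = datetime.fromtimestamp(1472830385, tz=timezone.utc).date()
print(utc_day == first_row)
```

When loading the raw data with pandas, passing an explicit format, for example `pd.to_datetime(df["date"].astype(str), format="%Y%m%d")`, avoids the epoch interpretation in the first place.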